Preface to the Fourth Edition 


In the Preface to the first edition of this book, published thirty years ago, 
we wrote that our aim was to help the reader to acquire a ‘reasonable under- 
standing of gauge theories that are being tested by contemporary experiments 
in high-energy physics’; and we stressed that our approach was intended to 
be both practical and accessible. 

We have pursued the same aim and approach in later editions. Shortly 
after the appearance of the first edition, a series of major discoveries at the 
CERN $p\bar{p}$ collider confirmed the existence of the W and Z bosons, with
properties predicted by the Glashow-Salam-Weinberg electroweak gauge theory;
and also provided further support for quantum chromodynamics, or QCD. 
Our second edition followed in 1989, expanded so as to include discussion, 
on the experimental side, of the new results; and, on the theoretical side, a 
fuller treatment of QCD, and an elementary introduction to quantum field 
theory, with limited applications. Subsequently, experiments at LEP and 
other laboratories were precise enough to test the Standard Model beyond 
the first order in perturbation theory (‘tree level’), being sensitive to higher 
order effects (‘loops’). In response, we decided it was appropriate to include 
the basics of ‘one-loop physics’. Together with the existing material on rel- 
ativistic quantum mechanics, and QED, this comprised volume 1 (2003) of 
our two-volume third edition. In a natural division, the non-Abelian gauge 
theories of the Standard Model, QCD and the electroweak theory, formed the 
core of volume 2 (2004). The progress of research on QCD, both theoretical 
and experimental, required new chapters on lattice quantum field theory, and 
on the renormalization group. The discussion of the central topic of sponta- 
neous symmetry breaking was extended, in particular so as to include chiral 
symmetry breaking. 

This new fourth edition retains the two-volume format, which has been 
generally well received, with broadly the same allocation of content as in 
the third edition. The principal new additions are, once again, dictated by 
substantial new experimental results — namely, in the areas of CP violation and 
neutrino oscillations, where great progress was made in the first decade of this 
century. Volume 2 now includes a new chapter devoted to CP violation and 
oscillations in mesonic and neutrino systems. Partly by way of preparation for 
this, volume 1 also contains a new chapter, on Lorentz transformations and 
discrete symmetries. We give a simple do-it-yourself treatment of Lorentz 
transformations of Dirac spinors, which the reader can connect to the group 
theory approach in appendix M of volume 2; the transformation properties of 
bilinear covariants are easily managed. We also introduce Majorana fermions 
at an early stage. This material is suitable for first courses on relativistic 
quantum mechanics, and perhaps should have been included in earlier editions 
(we thank a referee for urging its inclusion now). 

To make room for the new chapter in volume 1, the two introductory 
chapters of the third edition have been condensed into a single one, in the 
knowledge that excellent introductions to the basic facts of particle physics are 
available elsewhere. Otherwise, apart from correcting the known minor errors 
and misprints, the only other changes in volume 1 are some minor improve- 
ments in presentation, and appropriate updates on experimental numbers. 
Volume 2 contains significantly more in the way of updates and additions, as 
will be detailed in the Preface to that volume. But we have continued to omit 
discussion of speculations going beyond the Standard Model; after all, the cru- 
cial symmetry-breaking (Higgs) sector has only now become experimentally 
accessible. 

The Particles and Forces of the Standard 
Model 


1.1 Introduction: the Standard Model 


The traditional goal of particle physics has been to identify what appear to be 
structureless units of matter and to understand the nature of the forces act- 
ing between them; all other entities are then to be successively constructed as 
composites of these elementary building blocks. The enterprise has a two-fold 
aspect: matter on the one hand, forces on the other. The expectation is that 
the smallest units of matter should interact in the simplest way; or that there 
is a deep connection between the basic units of matter and the basic forces. 
The joint matter/force nature of the enquiry is perfectly illustrated by
Thomson’s discovery of the electron and Maxwell’s theory of the electromagnetic
field, which together mark the birth of modern particle physics. The electron 
was recognized both as the ‘particle of electricity’ — or as we might now say, 
as an elementary source of the electromagnetic field, with its motion consti- 
tuting an electromagnetic current — and also as an important constituent of 
matter. In retrospect, the story of particle physics over the subsequent one 
hundred years or so has consisted in the discovery and study of two new (non- 
electromagnetic) forces — the weak and the strong forces — and in the search 
for ‘electron-figures’ to serve both as constituents of the new layers of matter 
which were uncovered (first nuclei, and then hadrons) and also as sources of 
the new force fields. In the last quarter of the twentieth century, this effort 
culminated in decisive progress: the identification of a collection of matter 
units which are indeed analogous to the electron; and the highly convincing 
experimental verification of theories of the associated strong and weak force 
fields, which incorporate and generalize in a beautiful way the original elec- 
tron/electromagnetic field relationship. These theories are collectively called 
‘the Standard Model’ (or SM for short), to which this book is intended as an 
elementary introduction. 

In brief, the picture is as follows. The matter units are fermions, with 
spin-$\frac{1}{2}$ (in units of $\hbar$). They are of two types, leptons and quarks. Both are
structureless at the smallest distances currently probed by the highest-energy 
accelerators. The leptons are generalizations of the electron, the term denoting 
particles which, if charged, interact both electromagnetically and weakly; and 
if neutral, only weakly. By contrast, the quarks — which are the constituents 
of hadrons, and thence of nuclei — interact via all three interactions, strong, 
electromagnetic and weak. The weak and electromagnetic interactions of both 
quarks and leptons are described in a (partially) unified way by the electroweak 
theory of Glashow, Salam and Weinberg (GSW), which is a generalization 
of quantum electrodynamics or QED; the strong interactions of quarks are 
described by quantum chromodynamics or QCD, which is also analogous to 
QED. The similarity with QED lies in the fact that all three interactions are 
types of gauge theories, though realized in different ways. In the first volume 
of this book, we will get as far as QED; QCD and the electroweak theory are 
treated in volume 2. 

The reader will have noticed that the most venerable force of all — gravity 
—is absent from our story. In practical terms this is quite reasonable, since its 
effect is very many orders of magnitude smaller than even the weak force, at 
least until the interparticle separation reaches distances far smaller than those 
we shall be discussing. Conceptually also, gravity still seems to be somewhat 
distinct from the other forces which, as we have already indicated, are encour- 
agingly similar. There are no particular fermionic sources carrying ‘gravity 
charges’: it seems that all matter gravitates. This of course was a motivation 
for Einstein’s geometrical approach to gravity. Despite the lingering promise 
of string theory (Green et al. 1987, Polchinski 1998, Zwiebach 2004), it is 
fair to say that the vision of the unification of all the forces, which possessed 
Einstein, is still some way from realization. Gravitational interactions are not 
part of the SM. 


This book is not intended as a completely self-contained textbook on par- 
ticle physics, which would survey the broad range of observed phenomena and 
outline the main steps by which the picture described here has come to be 
accepted. For this we must refer the reader to other sources (e.g. Perkins 
2000, Bettini 2008). We proceed with a brief review of the matter (fermionic) 
content of the SM. 




1.2 The fermions of the Standard Model 
1.2.1 Leptons 


Forty years after Thomson’s discovery of the electron, the first member of 
another generation of leptons (as it turned out) — the muon — was found inde- 
pendently by Street and Stevenson (1937), and by Anderson and Neddermeyer 
(1937). Following the convention for the electron, the $\mu^-$ is the particle and
the $\mu^+$ the antiparticle. At first, the muon was identified with the particle
postulated by Yukawa only two years earlier (1935) as the field quantum of 
the ‘strong nuclear force field’, the exchange of which between two nucleons 
would account for their interaction (see section 1.3.2). In particular, its mass 
(105.7 MeV) was nicely within the range predicted by Yukawa. However, ex- 
periments by Conversi et al. (1947) established that the muon could not be 
Yukawa’s quantum since it did not interact strongly; it was therefore a lepton. 
The $\mu^-$ seems to behave in exactly the same way as the electron, interacting
only electromagnetically and weakly, with interaction strengths identical to 
those of an electron. 

In 1975 Perl et al. (1975) discovered yet another ‘replicant’ electron, the
$\tau^-$, with a mass of 1.78 GeV. Once again, the weak and electromagnetic
interactions of the $\tau^-$ ($\tau^+$) are identical to those of the $e^-$ ($e^+$).

At this stage one might well wonder whether we are faced with a ‘lepton 
spectroscopy’, of which the $e^-$, $\mu^-$ and $\tau^-$ are but the first three states. Yet
this seems not to be the correct interpretation. First, no other such states have
(so far) been seen. Second, all these leptons have the same spin ($\frac{1}{2}$), which
is certainly quite unlike any conventional excitation spectrum. And third,
no $\gamma$-transitions are observed to occur between the states, though this would
normally be expected. For example, the branching fraction for the process 


$\mu^- \to e^- + \gamma$ (not observed) (1.1)


is currently quoted as less than $1.2 \times 10^{-11}$ at the 90% confidence level
(Nakamura et al. 2010). Similarly there are (much less stringent) limits on
$\tau^- \to \mu^- + \gamma$ and $\tau^- \to e^- + \gamma$.

If the $e^-$ and $\mu^-$ states in (1.1) were, in fact, the ground and first excited
states of some composite system, the decay process (1.1) would be expected 
to occur as an electromagnetic transition, with a relatively high probability 
because of the large energy release. Yet the experimental upper limit on the 
rate is very tiny. In the absence of any mechanism to explain this, one sys- 
tematizes the situation, empirically, by postulating the existence of a selection 
rule forbidding the decay (1.1). In taking this step, it is important to real- 
ize that ‘absolute forbidden-ness’ can never be established experimentally: all 
that can be done is to place a (very small) upper limit on the branching frac- 
tion to the ‘forbidden’ channel, as here. The possibility will always remain 
open that future, more sensitive, experiments will reveal that some processes, 
assumed to be forbidden, are in fact simply extremely rare. 

Of course, such a proposed selection rule would have no physical content if 
it only applied to the one process (1.1); but it turns out to be generally true, 
applying not only to the electromagnetic interaction of the charged leptons, 
but to their weak interactions also. The upshot is that we can consistently 
account for observations (and non-observations) involving e’s, $\mu$’s and $\tau$’s by
assigning to each a new additive quantum number (called ‘lepton flavour’)
which is assumed to be conserved. Thus we have electron flavour $L_e$ such that
$L_e(e^-) = 1$ and $L_e(e^+) = -1$; muon flavour $L_\mu$ such that $L_\mu(\mu^-) = 1$ and
$L_\mu(\mu^+) = -1$; and tau flavour $L_\tau$ such that $L_\tau(\tau^-) = 1$ and $L_\tau(\tau^+) = -1$.
Each is postulated to be conserved in all leptonic processes. So (1.1) is then
forbidden, the left-hand side having $L_e = 0$ and $L_\mu = 1$, while the right-hand
side has $L_e = 1$ and $L_\mu = 0$.

The electromagnetic interactions of the mu and the tau leptons are the 
same as for the electron. In weak interactions, each charged lepton ($e$, $\mu$, $\tau$) is
accompanied by its ‘own’ neutral partner, a neutrino. The one emitted with
the $e^-$ in $\beta$-decay was originally introduced by Pauli in 1930, as a ‘desperate
remedy’ to save the conservation laws of four-momentum and angular momen- 
tum. In the Standard Model, the three neutrinos are assigned lepton flavour 
quantum numbers in such a way as to conserve each lepton flavour separately. 
Thus we assign $L_e = -1$, $L_\mu = 0$, $L_\tau = 0$ to the neutrino emitted in neutron
$\beta$-decay

$n \to p + e^- + \bar{\nu}_e$, (1.2)
since $L_e = 0$ in the initial state and $L_e(e^-) = +1$; so the neutrino in (1.2) is an
antineutrino ‘of electron type’ (or ‘of electron flavour’). The physical reality 
of the antineutrinos emitted in nuclear $\beta$-decay was established by Reines and
collaborators in 1956 (Cowan et al. 1956), by observing that the antineutrinos
from a nuclear reactor produced positrons via the inverse $\beta$-process


$\bar{\nu}_e + p \to n + e^+$. (1.3)
The neutrino partnering the $\mu^-$ appears in the decay of the $\pi^-$:
$\pi^- \to \mu^- + \bar{\nu}_\mu$, (1.4)


where the $\bar{\nu}_\mu$ is an antineutrino of muon type ($L_\mu(\bar{\nu}_\mu) = -1$,
$L_e(\bar{\nu}_\mu) = 0 = L_\tau(\bar{\nu}_\mu)$). How do we know that $\bar{\nu}_\mu$ and $\bar{\nu}_e$ are not the same? An important
experiment by Danby et al. (1962) provided evidence that they are not. They
found that the neutrinos accompanying muons from $\pi$-decay always produced
muons on interacting with matter, never electrons. Thus, for example, the 
lepton flavour conserving reaction 


$\bar{\nu}_\mu + p \to \mu^+ + n$ (1.5)
was observed, but the lepton flavour violating reaction
$\bar{\nu}_\mu + p \to e^+ + n$ (not observed) (1.6)


was not. As with (1.1), ‘non-observation’ of course means, in practice, an 
upper limit on the cross section. Both types of neutrino occur in the $\beta$-decay
of the muon itself:

$\mu^- \to \nu_\mu + e^- + \bar{\nu}_e$, (1.7)
in which $L_\mu = 1$ is initially carried by the $\mu^-$ and finally by the $\nu_\mu$, and the
$L_e$’s of the $e^-$ and $\bar{\nu}_e$ cancel each other out.

In the same way, the $\nu_\tau$ is associated with the $\tau^-$, and we have arrived at
three generations of charged and neutral lepton doublets: 


$(\nu_e, e^-)$, $(\nu_\mu, \mu^-)$ and $(\nu_\tau, \tau^-)$ (1.8)


together with their antiparticles. 
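
The bookkeeping behind these selection rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `FLAVOUR`, the particle labels and the helper `conserves_lepton_flavour` are names invented here, and the quantum-number assignments are simply those given in the text. Any particle not listed (nucleons, pions, the photon) is taken to carry no lepton flavour.

```python
# Lepton flavour assignments (L_e, L_mu, L_tau), as described in the text.
# The labels and helper names below are illustrative, not from the book.
FLAVOUR = {
    "e-": (1, 0, 0), "e+": (-1, 0, 0),
    "mu-": (0, 1, 0), "mu+": (0, -1, 0),
    "tau-": (0, 0, 1), "tau+": (0, 0, -1),
    "nu_e": (1, 0, 0), "nubar_e": (-1, 0, 0),
    "nu_mu": (0, 1, 0), "nubar_mu": (0, -1, 0),
    "nu_tau": (0, 0, 1), "nubar_tau": (0, 0, -1),
}
NO_FLAVOUR = (0, 0, 0)  # assumed for every particle not listed above


def total_flavour(particles):
    """Add up (L_e, L_mu, L_tau) over a list of particle labels."""
    totals = [0, 0, 0]
    for name in particles:
        for i, value in enumerate(FLAVOUR.get(name, NO_FLAVOUR)):
            totals[i] += value
    return tuple(totals)


def conserves_lepton_flavour(initial, final):
    """True if each lepton flavour is separately conserved."""
    return total_flavour(initial) == total_flavour(final)


REACTIONS = {
    "(1.1)": (["mu-"], ["e-", "gamma"]),
    "(1.2)": (["n"], ["p", "e-", "nubar_e"]),
    "(1.3)": (["nubar_e", "p"], ["n", "e+"]),
    "(1.4)": (["pi-"], ["mu-", "nubar_mu"]),
    "(1.5)": (["nubar_mu", "p"], ["mu+", "n"]),
    "(1.6)": (["nubar_mu", "p"], ["e+", "n"]),
    "(1.7)": (["mu-"], ["nu_mu", "e-", "nubar_e"]),
}

for label, (initial, final) in REACTIONS.items():
    verdict = "allowed" if conserves_lepton_flavour(initial, final) else "forbidden"
    print(label, verdict)
```

Only (1.1) and (1.6) come out as forbidden, in line with the non-observations quoted above. The check says nothing about other conservation laws (charge, energy, baryon number), which would need their own columns.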
TABLE 1.1
Properties of SM leptons.


Generation   Particle     Mass (MeV)              $Q/e$   $L_e$   $L_\mu$   $L_\tau$

1            $\nu_e$      $< 2 \times 10^{-6}$    0       1       0         0
             $e^-$        0.511                   -1      1       0         0
2            $\nu_\mu$    $< 0.19$                0       0       1         0
             $\mu^-$      105.658                 -1      0       1         0
3            $\nu_\tau$   $< 18.2$                0       0       0         1
             $\tau^-$     1777                    -1      0       0         1


We should at this point note that another type of weak interaction is 
known, in which — for example — the $\bar{\nu}_\mu$ in (1.5) scatters elastically from the
proton, instead of changing into a $\mu^+$:


$\bar{\nu}_\mu + p \to \bar{\nu}_\mu + p$. (1.9)


This is an example of what is called a ‘neutral current’ process, (1.5) being a 
‘charged current’ one. In terms of the Yukawa-like exchange mechanism for 
particle interactions, to be described in the next section, (1.5) proceeds via 
the exchange of charged quanta ($W^\pm$), while in (1.9) a neutral quantum ($Z^0$)
is exchanged. 

As well as their flavour, one other property of neutrinos is of great interest, 
namely their mass. As originally postulated by Pauli, the neutrino emitted in 
$\beta$-decay had to have very small mass, because the maximum energy carried
off by the $e^-$ in (1.2) was closely equal to the difference in rest energies of
the neutron and proton. It was subsequently widely assumed (perhaps largely 
for simplicity) that all neutrinos were strictly massless, and it is fair to say 
that the original Standard Model made this assumption. Yet there is, in fact, 
no convincing reason for this (as there is for the masslessness of the photon 
— see chapter 6), and there is now clear evidence that neutrinos do indeed 
have very small, but non-zero, masses. It turns out that the question of 
neutrino masslessness is directly connected to another one: whether neutrino 
flavour is, in fact, conserved. If neutrinos are massless, as in the original 
Standard Model, neutrinos of different flavour cannot ‘mix’, in the sense of 
quantum-mechanical states; but mixing can occur if neutrinos have mass. The 
phenomenon of neutrino flavour mixing (or ‘neutrino oscillations’) is now well 
established, and is a subject of intense research. In this book we shall simply 
regard non-zero neutrino masses as part of the (updated) Standard Model. 

The SM leptons are listed in table 1.1, along with some relevant properties. 
Note that the limits on the neutrino masses, which are taken from Nakamura 
et al. 2010, do not include the results obtained from analyses of neutrino 
oscillations. These oscillations, to which we shall return in chapter 21 in 
volume 2, are sensitive to the differences of squared masses of the neutrinos, 
not to the absolute scale of mass. 

We now turn to the other fermions in the SM. 


1.2.2 Quarks 


Quarks are the constituents of hadrons, in which they are bound by the strong 
QCD forces. Hadrons with spins $\frac{1}{2}$, $\frac{3}{2}$, $\frac{5}{2}$, ... (i.e. fermions) are baryons, those
with spins 0, 1, 2, ... (i.e. bosons) are mesons. Examples of baryons are
nucleons (the neutron n and the proton p), and hyperons such as $\Lambda^0$ and the
$\Sigma$ and $\Xi$ states. Evidence for the composite nature of hadrons accumulated
during the 1960s and 1970s. Elastic scattering of electrons from protons by 
Hofstadter and co-workers (Hofstadter 1963) showed that the proton was not 
pointlike, but had an approximately exponential distribution of charge with a 
root mean square radius of about 0.8 fm. Much careful experimentation in the 
field of baryon and meson spectroscopy revealed sequences of excited states, 
strongly reminiscent of those well-known in atomic and nuclear physics. 

The conclusion would now seem irresistible that such spectra should be 
interpreted as the energy levels of systems of bound constituents. A spe- 
cific proposal along these lines was made in 1964 by Gell-Mann (1964) and 
Zweig (1964). Though based on somewhat different (and much more frag- 
mentary) evidence, their suggestion has turned out to be essentially correct. 
They proposed that baryons contain three spin-$\frac{1}{2}$ constituents called quarks
(by Gell-Mann), while mesons are quark-antiquark systems. One immediate 
consequence is that quarks have fractional electromagnetic charge. For example,
the proton has two quarks of charge $+\frac{2}{3}$, called ‘up’ (u) quarks, and one
quark of charge $-\frac{1}{3}$, the ‘down’ (d) quark. The neutron has the combination
ddu, while the $\pi^+$ has one u and one anti-d ($\bar{d}$) and so on.
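
As a quick check of this charge assignment, the following Python sketch adds up the quark charges (in units of the proton charge) for the three hadrons just mentioned. The names `QUARK_CHARGE` and `hadron_charge`, and the `anti-` prefix used to label antiquarks, are conventions assumed here purely for illustration.

```python
from fractions import Fraction

# Quark charges in units of the proton charge, as quoted in the text.
QUARK_CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}


def hadron_charge(content):
    """Sum the charges of the constituents.

    `content` is a list of labels such as "u" or "anti-d"; an antiquark
    carries the opposite charge to the corresponding quark.
    """
    total = Fraction(0)
    for label in content:
        if label.startswith("anti-"):
            total -= QUARK_CHARGE[label[len("anti-"):]]
        else:
            total += QUARK_CHARGE[label]
    return total


print("proton  (uud):", hadron_charge(["u", "u", "d"]))
print("neutron (ddu):", hadron_charge(["d", "d", "u"]))
print("pi+ (u anti-d):", hadron_charge(["u", "anti-d"]))
```

Exact rational arithmetic is used so that the thirds add up to integers without rounding: the proton and the $\pi^+$ come out with charge 1 and the neutron with charge 0.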

Quite simple quantum-mechanical bound state quark models, based on 
these ideas, were remarkably successful in accounting for the observed hadronic 
spectra. Nevertheless, many physicists, in the 1960s and early 1970s, con- 
tinued to regard quarks more as useful devices for systematizing a mass of 
complicated data than as genuine items of physical reality. One reason for 
this scepticism must now be confronted, for it constitutes a major new twist 
in the story of the structure of matter. 

Gell-Mann ended his 1964 paper with the remark: ‘A search for stable 
quarks of charge $-\frac{1}{3}$ or $+\frac{2}{3}$ and/or stable di-quarks of charge $-\frac{2}{3}$ or $+\frac{1}{3}$ or
$+\frac{4}{3}$ at the highest energy accelerators would help to reassure us of the
non-existence of real quarks’. Indeed, with one possible exception (La Rue et al.
1977, 1981), this ‘reassurance’ has been handsomely provided! Unlike the 
constituents of atoms and nuclei, quarks have not been observed as stable 
isolated particles. When hadrons of the highest energies currently available 
are smashed into each other, what is observed downstream is only lots more 
hadrons, not fractionally charged quarks. The explanation for this novel
behaviour of quarks is now believed to lie in the nature of the interquark force
(QCD). We shall briefly discuss this force in section 1.3.6, and treat it in detail 
in volume 2. The consensus at present is that QCD does imply the ‘confinement’
of quarks — that is, they do not exist as isolated single particles¹, only
as groups confined to hadronic volumes. 

When Gell-Mann and Zweig made their proposal, three types of quark 
were enough to account for the observed hadrons: in addition to the u and 
d quarks, the ‘strange’ quark s was needed to describe the known strange 
particles such as the hyperon $\Lambda^0$ (uds), and the strange mesons like $K^0(d\bar{s})$.
In 1964, Bjorken and Glashow (1964) discussed the possible existence of a 
fourth quark on the basis of quark—lepton symmetry, but a strong theoretical 
argument for the existence of the c (‘charm’) quark, within the framework of 
gauge theories of electroweak interactions, was given by Glashow, Iliopoulos 
and Maiani (1970), as we shall discuss in volume 2. They estimated that 
the c quark mass should lie in the range 3–4 GeV. Subsequently, Gaillard
and Lee (1974) performed a full (one-loop) calculation in the then newly-developed
renormalizable electroweak theory, and predicted $m_c \approx 1.5$ GeV.
The prediction was spectacularly confirmed in November of the same year with
the discovery (Aubert et al. 1974, Augustin et al. 1974) of the $J/\psi$ system,
which was soon identified as a $c\bar{c}$ composite (and dubbed ‘charmonium’), with
a mass in the vicinity of 3 GeV. Subsequently, mesons such as $D^0(c\bar{u})$ and
$D^+(c\bar{d})$ carrying the c quark were identified (Goldhaber et al. 1976, Peruzzi
et al. 1976), consolidating this identification. 

The second generation of quarks was completed in 1974, with the two 
quark doublets (u, d) and (c, s) in parallel with the lepton doublets $(\nu_e, e^-)$
and $(\nu_\mu, \mu^-)$. But even before the discovery of the c quark, the possibility that
a completely new third-generation quark doublet might exist was raised in a 
remarkable paper by Kobayashi and Maskawa (1973). Their analysis focused 
on the problem of incorporating the known violation of CP symmetry (the 
product² of particle-antiparticle conjugation C and parity P) into the quark
sector of the renormalizable electroweak theory. CP-violation in the decays 
of neutral K-mesons had been discovered by Christenson et al. (1964), and 
Kobayashi and Maskawa pointed out that it was very difficult to construct a 
plausible model of CP-violation in weak transitions of quarks with only two 
generations. They suggested, however, that CP-violation could be naturally 
accommodated by extending the theory to three generations of quarks. Their 
description of CP-violation thus entailed the very bold prediction of two en- 
tirely new and undiscovered quarks, the (t, b) doublet, where t (‘top’) has 
charge $+\frac{2}{3}$ and b (‘bottom’) has charge $-\frac{1}{3}$.

In 1975, with the discovery of the $\tau^-$ mentioned earlier, there was already
evidence for a third generation of leptons. The discovery of the b quark
in 1977 resulted from the observation of massive mesonic states generally
known as $\Upsilon$ (‘upsilon’) (Herb et al. 1977, Innes et al. 1977), which were
identified as $b\bar{b}$ composites. Subsequently, b-carrying mesons were found.
Finally, firm evidence for the expected t quark was obtained by the CDF and
D0 collaborations at Fermilab in 1995 (Abe et al. 1995, Abachi et al. 1995);
see Bettini 2008, section 4.10, for details about the discovery of the top quark.
The full complement of three generations of quark doublets is then


(u, d), (c, s) and (t, b) (1.10)


together with their antiparticles, in parallel with the three generations of
lepton doublets (1.8).

¹ With the (fleeting) exception of the t quark, as we shall see in a moment.
² We shall discuss these symmetries in chapter 4.

One particular feature of the t quark requires comment. Its mass is so 
large that, although it decays weakly, the energy release is so great that its 
lifetime is some two orders of magnitude shorter than typical strong interaction 
timescales; this means that it decays before any t-carrying hadrons can be 
formed. So when a t quark is produced (in a p-p collision, for example), 
it decays as a free (unbound) particle. Its mass can be determined from a 
kinematic analysis of the decay products.

We must now discuss the quantum numbers carried by quarks. First of 
all, each quark listed in (1.10) comes in three varieties, distinguished by a 
quantum number called ‘colour’. It is precisely this quantum number that 
underlies the dynamics of QCD (see section 1.3.6). Colour, in fact, is a kind 
of generalized charge, for the strong QCD interactions. We shall denote the 
three colours of a quark by ‘red’, ‘blue’, and ‘green’. Thus we have the triplet 
$(u_r, u_b, u_g)$, and similarly for all the other quarks.

Secondly, quarks carry flavour quantum numbers, like the leptons. In the 
quark case, they are as follows. The two quarks which are familiar in ordinary 
matter, ‘u’ and ‘d’, are an isospin doublet (see chapter 12 in volume 2) with 
$T_3 = +1/2$ for ‘u’ and $T_3 = -1/2$ for ‘d’. The flavour of ‘s’ is strangeness,
with the value $S = -1$. The flavour of ‘c’ is charm, with value $C = +1$, that
of ‘b’ has value $\tilde{B} = -1$ (we use $\tilde{B}$ to distinguish it from baryon number $B$),
and the flavour of ‘t’ is T = +1. The convention is that the sign of the flavour 
number is the same as that of the charge. 

The strong and electromagnetic interactions of quarks are independent 
of quark flavour, and depend only on the strong charge and the
electromagnetic charge, respectively. This means, in particular, that flavour cannot
change in a strong interaction among hadrons — that is, flavour is conserved 
in such interactions. For example, from a zero strangeness initial state, the 
strong interaction can only produce pairs of strange particles, with cancelling 
strangeness. This is the phenomenon of ‘associated production’, known since 
the early days of strange particle physics in the 1950s. Similar rules hold for 
the other flavours: for example, the t quark, once produced, cannot decay to 
a lighter quark via a strong interaction, since this would violate T-conserva- 
tion. 
TABLE 1.2
Properties of SM quarks.


Generation   Particle               Mass                $Q/e$   $S$   $C$   $\tilde{B}$   $T$

1            $u_r$ $u_b$ $u_g$      1.7 to 3.1 MeV      2/3     0     0     0             0
             $d_r$ $d_b$ $d_g$      4.1 to 5.7 MeV      -1/3    0     0     0             0
2            $c_r$ $c_b$ $c_g$      1.15 to 1.35 GeV    2/3     0     1     0             0
             $s_r$ $s_b$ $s_g$      80 to 130 MeV       -1/3    -1    0     0             0
3            $t_r$ $t_b$ $t_g$      172 to 174 GeV      2/3     0     0     0             1
             $b_r$ $b_b$ $b_g$      4 to 5 GeV          -1/3    0     0     -1            0


In weak interactions, by contrast, quark flavour is generally not conserved. 
For example, in the semi-leptonic decay 


$\Lambda^0(uds) \to p(uud) + e^- + \bar{\nu}_e$, (1.11)


an s quark changes into a u quark. The rather complicated flavour structure 
of weak interactions, which remains an active field of study, will be reviewed 
when we come to the GSW theory in volume 2. However, one very important, 
though technical, point must be made about the weak interactions of quarks 
and leptons. It is natural to wonder whether a new generation of quarks 
might appear, unaccompanied by the corresponding leptons — or vice versa. 
Within the framework of the Standard Model interactions, the answer is no. 
It turns out that subtle quantum field theory effects called ‘anomalies’, to be 
discussed in chapter 18 of volume 2, would spoil the renormalizability of the 
weak interactions (see section 1.4.1), unless there are equal numbers of quark 
and lepton generations. 

We end this section with some comments about the quark masses; the 
values listed in Table 1.2 are based on those given in Nakamura et al. (2010). 
As we have already noted, the t quark is the only one whose mass can be 
directly measured. All the others are (it would appear) permanently confined 
inside hadrons. It is therefore not immediately obvious how to define — and 
measure — their masses. In a more familiar bound state problem, such as a 
nucleus, the masses of the constituents are those we measure when they are 
free of the nuclear binding forces — i.e. when they are far apart. For the QCD 
force, the situation is very different. There it turns out that the force is very 
weak at short distances, a property called asymptotic freedom — see section 
1.3.6; this important property will be treated in section 15.3 of volume 2. We 
may think of the force as very roughly analogous to that of a spring joining two 
constituents. To separate them, energy must be supplied to the system. So 
when the constituents are no longer close, the energy of the system is greater
than the sum of the short distance (free) quark masses. In potential models
(see section 1.3.6), the effect is least pronounced for the ‘heavy’ quarks ($m_q$
greater than about 1 GeV). For example, the ground state of the $\Upsilon(b\bar{b})$ lies at
about 9.46 GeV, which is close to the average value of $2m_b$ as given in Table
1.2. For $\psi(c\bar{c})$ the ground state is at about 3 GeV, somewhat greater than
$2m_c$. For the three lightest quarks, and especially for the u and d quarks, the
position is quite different: for example, the proton (uud) with a mass of 938
MeV is far more massive than $2m_u + m_d$. Here the ‘spring’ is responsible for
about 300 MeV per quark.
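
The arithmetic behind these comparisons can be reproduced with a short Python sketch. It is a rough illustration only: it takes the midpoint of each mass range in Table 1.2 as a representative quark mass (an assumption made here for convenience, not a statement about how the masses are defined), and the names `MASS_RANGE_MEV`, `midpoint`, `unexplained_fraction` and `excess_per_quark` are invented for this example.

```python
# Quark mass ranges from Table 1.2, converted to MeV.
MASS_RANGE_MEV = {
    "u": (1.7, 3.1),
    "d": (4.1, 5.7),
    "c": (1150.0, 1350.0),
    "b": (4000.0, 5000.0),
}


def midpoint(quark):
    """Midpoint of the quoted range, used as a representative mass (assumption)."""
    low, high = MASS_RANGE_MEV[quark]
    return 0.5 * (low + high)


def unexplained_fraction(hadron_mass_mev, quarks):
    """Fraction of the hadron mass NOT accounted for by the quark masses."""
    quark_sum = sum(midpoint(q) for q in quarks)
    return (hadron_mass_mev - quark_sum) / hadron_mass_mev


def excess_per_quark(hadron_mass_mev, quarks):
    """Mass excess over the summed quark masses, shared equally per quark."""
    quark_sum = sum(midpoint(q) for q in quarks)
    return (hadron_mass_mev - quark_sum) / len(quarks)


# An antiquark has the same mass as its quark, so only the flavour is listed.
HADRONS = [
    ("Upsilon (b bbar)", 9460.0, ["b", "b"]),
    ("psi (c cbar)", 3000.0, ["c", "c"]),
    ("proton (uud)", 938.0, ["u", "u", "d"]),
]

for name, mass, quarks in HADRONS:
    fraction = unexplained_fraction(mass, quarks)
    excess = excess_per_quark(mass, quarks)
    print(f"{name}: {100 * fraction:.0f}% unexplained, {excess:.0f} MeV per quark")
```

With these inputs the unexplained fraction is about 5% for the $\Upsilon$, about 17% for the $\psi$ and about 99% for the proton, whose excess of roughly 309 MeV per quark matches the figure of about 300 MeV quoted above.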

While this picture is qualitatively useful, it is clearly model dependent, 
as would be even a more sophisticated quark model. To do the job properly, 
we have to go to the actual QCD Lagrangian, and use it to calculate the 
hadron masses with the Lagrangian masses as input. This can be done through 
a lattice simulation of the field theory, as will be described in chapter 16 
of volume 2. Independently, another handle on the Lagrangian masses is 
provided by the fact that the QCD Lagrangian has an extra symmetry (‘chiral 
symmetry’) which is exact when the quark masses are zero. This is, in fact, 
an excellent approximation for the u and d quarks, and a fair one for the 
s quark. The symmetry is, however, dynamically (‘spontaneously’) broken 
by QCD, in such a way as to generate (in the case $m_u = m_d = 0$) the
nucleon mass entirely dynamically, along with a massless pion. The small 
Lagrangian masses can then be treated perturbatively in a procedure called 
‘chiral perturbation theory’. These essential features of QCD will be treated 
in chapter 18 of volume 2. For the moment, we accept the values in Table 1.2; 
Nakamura et al. (2010) contains a review of quark masses. 




1.3 Particle interactions in the Standard Model 
1.3.1 Classical and quantum fields 


In the world of the classical physicist, matter and force were clearly separated. 
The nature of matter was intuitive, based on everyday macroscopic experience; 
force, however, was more problematical. Contact forces between bodies were 
easy to understand, but forces which seemed capable of acting at a distance 
caused difficulties. 


That gravity should be innate, inherent and essential to matter, so 
that one body can act upon another at a distance, through a vacuum, 
without the mediation of anything else, by and through which action 
and force may be conveyed from one to the other, is to me so great 
an absurdity, that I believe no man who has in philosophical matters 
a competent faculty of thinking can ever fall into it. (Letter from 
Newton to Bentley) 


Newton could find no satisfactory mechanism or physical model, for the trans- 
mission of the gravitational force between two distant bodies; but his dynam- 
ical equations provided a powerful predictive framework, given the (unex- 
plained) gravitational force law; and this eventually satisfied most people. 

The 19th century saw the precise formulation of the more intricate force 
laws of electromagnetism. Here too the distaste for action-at-a-distance the- 
ories led to numerous mechanical or fluid mechanical models of the way elec- 
tromagnetic forces (and light) are transmitted. Maxwell made brilliant use 
of such models as he struggled to give physical and mathematical substance 
to Faraday’s empirical ideas about lines of force. Maxwell’s equations were 
indeed widely regarded as describing the mechanical motion of the ether — an 
amazing medium, composed of vortices, gear wheels, idler wheels and so on. 
But in his 1864 paper, the third and final one of the series on lines of force 
and the electromagnetic field, Maxwell himself appeared ready to throw away 
the mechanical scaffolding and let the finished structure of the field equations 
stand on its own. Later these field equations were derived from a Lagrangian 
(see chapter 7), and many physicists came to agree with Poincaré that this 
‘generalized mechanics’ was more satisfactory than a multitude of different 
ether models; after all, the same mathematical equations can describe, when 
suitably interpreted, systems of masses, springs and dampers, or of induc- 
tors, capacitors and resistors. With this step, the concepts of mechanics were 
enlarged to include a new fundamental entity, the electromagnetic field. 

The action-at-a-distance dilemma was solved, since the electromagnetic 
field permeates all of space surrounding charged or magnetic bodies, responds 
locally to them, and itself acts on other distant bodies, propagating the action 
to them at the speed of light: for Maxwell’s theory, besides unifying electricity 
and magnetism, also predicted the existence of electromagnetic waves which 
should travel with the speed of light, as was confirmed by Hertz in 1888. 
Indeed, light was a form of electromagnetic wave. 

Maxwell published his equations for the dynamics of the electromagnetic 
field (Maxwell 1864) some forty years before Einstein’s 1905 paper introducing 
special relativity. But Maxwell’s equations are fully consistent with relativ- 
ity as they stand (see chapter 2), and thus constitute the first relativistic 
(classical) field theory. The Maxwell Lagrangian lives on, as part of QED. 

It seems almost to be implied by the local field concept, and the desire to 
avoid action at a distance, that the fundamental carriers of electricity should 
themselves be point-like, so that the field does not, for example, have to 
interact with different parts of an electron simultaneously. Thus the point- 
like nature of elementary matter units seems intuitively to be tied to the local 
nature of the force field via which they interact. 

Very soon after the successes of classical field physics, however, another 
world began to make its appearance — the quantum one. First the photoelec- 
tric effect and then — much later — the Compton effect showed unmistakeably 
that electromagnetic waves somehow also had a particle-like aspect, the photon.
At about the same time, the intuitive understanding of the nature of
matter began to fail as well: supposedly particle-like things, like electrons, 
displayed wave-like properties (interference and diffraction). Thus the con- 
ceptual distinction between matter and forces, or between particle and field, 
was no longer so clear. On the one hand, electromagnetic forces, treated in 
terms of fields, now had a particle aspect; and on the other hand, particles 
now had a wave-like or field aspect. ‘Electrons’, writes Feynman (1965a) at 
the beginning of volume 3 of his Lectures on Physics, ‘behave just like light’. 

How can we build a theory of electrons and photons which does justice to 
all the ‘point-like’, ‘local’, ‘wave/particle’ ideas just discussed? Consider the 
apparently quite simple process of spontaneous decay of an excited atomic 
state in which a photon is emitted: 


$$A^* \to A + \gamma. \tag{1.12}$$


Ordinary non-relativistic quantum mechanics cannot provide a first-principles 
account of this process, because the degrees of freedom it normally discusses 
are those of the ‘matter’ units alone — that is, in this example, the electronic 
degrees of freedom. However, it is clear that something has changed radi- 
cally in the field degrees of freedom. On the left-hand side, the matter is in 
an excited state and the electromagnetic field is somehow not manifest; on 
the right, the matter has made a transition to a lower-energy state and the 
energy difference has gone into creating a quantum of electromagnetic radia- 
tion. What is needed here is a quantum theory of the electromagnetic field — 
a quantum field theory. 

Quantum field theory — or qft for short — is the fundamental formal and 
conceptual framework of the Standard Model. An important purpose of this 
book is to make this core twentieth century formalism more generally accessi- 
ble. In chapter 5 we give a step-by-step introduction to qft. We shall see that 
a free classical field — which has infinitely many degrees of freedom — can be 
thought of as mathematically analogous to a vibrating solid (which has merely 
a very large number). The way this works mathematically is that the Fourier 
components of the field act like independent harmonic oscillators, just like the 
vibrational ‘normal modes’ of the solid. When quantum mechanics is applied 
to this system, the energy eigenstates of each oscillator are quantized in the 
familiar way, as $(n_r + 1/2)\hbar\omega_r$, for each oscillator of frequency $\omega_r$: we say that
such states contain ‘$n_r$ quanta of frequency $\omega_r$’. The state of the entire field
is characterized by how many quanta of each frequency are present. These 
‘excitation quanta’ are the particle aspect of the field. In the ground state 
there are no excitations present — no field quanta — and so that is the vacuum 
state of the field. 
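
The counting of quanta just described can be illustrated with a short sketch. It is only an illustration, not part of the formal development of chapter 5: the function names, the choice of units with $\hbar = 1$, and the three mode frequencies are assumptions made for the example. A field state is represented simply by the number of quanta in each mode.

```python
# Sketch: energy of a free-field state labelled by occupation numbers.
# Assumptions (not from the text): units with hbar = 1, three sample modes.
HBAR = 1.0


def mode_energy(n, omega, hbar=HBAR):
    """Energy (n + 1/2) * hbar * omega of one oscillator mode holding n quanta."""
    if n < 0:
        raise ValueError("occupation number must be non-negative")
    return (n + 0.5) * hbar * omega


def field_energy(occupations, hbar=HBAR):
    """Total energy of a state given as {omega_r: n_r}."""
    return sum(mode_energy(n, omega, hbar) for omega, n in occupations.items())


def excitation_energy(occupations, hbar=HBAR):
    """Energy above the vacuum (the state with no quanta in any mode)."""
    vacuum = {omega: 0 for omega in occupations}
    return field_energy(occupations, hbar) - field_energy(vacuum, hbar)


# Two quanta of frequency 1.0, none of frequency 2.5, one of frequency 4.0.
state = {1.0: 2, 2.5: 0, 4.0: 1}
print(excitation_energy(state))  # sum of n_r * hbar * omega_r over the modes
```

Only the energy above the vacuum depends on how many quanta are present; the zero-point terms are the same for every state built on the same set of modes.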

In the case of the electromagnetic field, these quanta are of course photons 
(for the solid, they are phonons). In the process (1.12) the electromagnetic 
field was originally in its ground (no photon) state, and was raised finally to an 
excited state by the transfer of energy from the electronic degrees of freedom. 
The final excited field state is defined by the presence of one quantum (photon)
of the appropriate energy. 

We obviously cannot stop here (‘Electrons behave just like light’). All the 
particles of the SM must be described as excitation quanta of the correspond- 
ing quantum fields. But of course Feynman was somewhat overstating the 
case. The quanta of the electromagnetic field are bosons, and there is no limit 
on the number of them that can occupy a single quantum state. By contrast, 
the quanta of the electron field, for example, must be fermions, obeying the 
exclusion principle. In chapter 7 we shall see what modifications to the quan- 
tization procedure this requires. We must also introduce interactions between 
the excitation quanta, or equivalently between the quantum fields. This we 
do in chapter 6 for bosonic fields, and in chapter 7 for the Dirac and Maxwell 
fields thereby arriving at QED, our first quantum gauge field theory of the 
SM. 

One reason the Lagrangian formulation of classical field (or particle) physics 
is so powerful is that symmetries can be efficiently incorporated, and their con- 
nection with conservation laws easily exhibited. The same is even more true 
in qft. For example, only in qft can the symmetry corresponding to electric 
charge conservation be simply understood. Indeed, all the quantum gauge 
field theories of the SM are deeply related to symmetries, as will become clear 
in the subsequent development. 

In some cases, however, the symmetry — though manifest in the Lagrangian 
— is not visible in the usual empirical ways (conservation laws, particle multi- 
plets, and so on). Instead, it is ‘spontaneously (or dynamically) broken’. This 
phenomenon plays a crucial role in both QCD and the GSW theory. An aid to 
understanding it physically is provided by the analogy between the vacuum 
state of an interacting qft and the ground state of an interacting quantum 
many-body system — an insight due to Nambu (1960). We give an extended 
discussion of spontaneously broken symmetry in Part VII of volume 2. We 
shall see how the neutral bosonic (Bogoliubov) superfluid, and the charged 
fermionic (BCS) superconductor, offer instructive working models of dynami- 
cal symmetry breaking, relevant to chiral symmetry breaking in QCD, and to 
the generation of gauge boson masses in the GSW theory. 

The road ahead is a long one, and we begin our journey at a more descrip- 
tive and pictorial level, making essential use of Yukawa’s remarkable insight 
into the quantum nature of force. In due course, in chapter 6, we shall be- 
gin to see how qft supplies the precise mathematical formulae associated with 
such pictures. 


1.3.2 The Yukawa theory of force as virtual quantum 
exchange 


Yukawa’s revolutionary paper (Yukawa 1935) proposed a theory of the strong 
interaction between a proton and a neutron, and also considered its possible 
extension to neutron $\beta$-decay. He built his theory by analogy with electromagnetism,
postulating a new field of force with an associated new field quantum,
analogous to the photon. In doing so, he showed with particular clarity how, 
in quantum field theory, particles interact by exchanging virtual quanta, which 
mediate the force. 

Before proceeding, we should emphasize that we are not presenting Yukawa’s 
ideas as a viable candidate theory of strong and weak interactions. Crucially, 
Yukawa assumed that the nucleons and his quantum (later identified with the 
pion) were point-like, but in fact both nucleons and pions are quark compos- 
ites with spatial extension. The true ‘strong’ interaction relates to the quarks, 
as we shall see in section 1.3.6. There are also other details of his theory which 
were (we now know) mistaken, as we shall discuss. Yet his approach was pro- 
found, and — as happens often in physics — even though the initial application 
was ultimately superseded, the ideas have broad and lasting validity. 

Yukawa began by considering what kind of static potential might describe 
the n-p interaction. It was known that this interaction decreased rapidly 
for interparticle separation r > 2 fm. Hence, the potential could not be of 
coulombic type $\propto 1/r$. Instead, Yukawa postulated an n-p potential energy
of the form

$$U(r) = -\frac{g_N^2}{4\pi}\,\frac{e^{-r/a}}{r} \tag{1.13}$$

where ‘$g_N$’ is a constant analogous to the electric charge $e$, $r = |\mathbf{r}|$ and ‘$a$’ is
a range parameter ($\sim 2$ fm). This static potential satisfies the equation


$$\left(\nabla^2 - \frac{1}{a^2}\right) U(\mathbf{r}) = g_N^2\,\delta(\mathbf{r}) \tag{1.14}$$


(see appendix G) showing that it may be interpreted as the mutual potential
energy of one point-like test nucleon of ‘strong charge’ $g_N$ due to the presence
of another point-like nucleon of equal charge $g_N$ at the origin, a distance $r$
away. Equation (1.14) should be thought of as a finite range analogue of
Poisson’s equation in electrostatics (equation (G.3))


$$\nabla^2 V(\mathbf{r}) = -\rho(\mathbf{r})/\epsilon_0, \tag{1.15}$$


the delta function in (1.14) (see appendix E) expressing the fact that the 
‘strong charge density’ acting as the source of the field is all concentrated into 
a single point, at the origin. 
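
A reader who wishes to check (1.14) without turning to appendix G can do so numerically. The sketch below is only a rough consistency check under stated assumptions: the values $g_N^2 = 1$ and $a = 2$ (in arbitrary units), the step sizes, and all function names are choices made for the illustration. It uses the fact that, for a spherically symmetric function, $\nabla^2 U = (1/r)\,\mathrm{d}^2(rU)/\mathrm{d}r^2$, and that the delta-function source shows up as a flux $4\pi R^2\,\mathrm{d}U/\mathrm{d}r \to g_N^2$ through a small sphere of radius $R$.

```python
import math


def yukawa(r, g2=1.0, a=2.0):
    """Static potential (1.13): U(r) = -(g_N^2 / 4 pi) * exp(-r/a) / r."""
    return -(g2 / (4.0 * math.pi)) * math.exp(-r / a) / r


def radial_laplacian(f, r, h=1e-3):
    """Laplacian of a spherically symmetric f, (1/r) d^2(r f)/dr^2, by central differences."""
    u = lambda x: x * f(x)
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)


def residual(r, g2=1.0, a=2.0):
    """Left-hand side of (1.14) away from the origin; should be close to zero."""
    U = lambda x: yukawa(x, g2, a)
    return radial_laplacian(U, r) - U(r) / a ** 2


def flux_through_sphere(R, g2=1.0, a=2.0):
    """4 pi R^2 dU/dr at radius R; approaches g_N^2 as R -> 0."""
    h = 1e-4 * R
    dU = (yukawa(R + h, g2, a) - yukawa(R - h, g2, a)) / (2.0 * h)
    return 4.0 * math.pi * R * R * dU


for r in (0.5, 1.0, 3.0):
    print(r, residual(r))          # small numbers, limited by the finite step
print(flux_through_sphere(1e-3))   # close to g2 = 1
```

The residual vanishes (to numerical accuracy) for every $r \neq 0$, while the flux through a shrinking sphere tends to $g_N^2$: together these are the content of the delta function on the right-hand side of (1.14).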

Yukawa now sought to generalize (1.14) to the non-static case, so as to 
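obtain a field equation for $U(\mathbf{r}, t)$. For $r \neq 0$, he proposed the free-space
equation (we shall keep factors of $c$ and $\hbar$ explicit for the moment)


$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{1}{a^2}\right) U(\mathbf{r}, t) = 0 \tag{1.16}$$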


which is certainly relativistically invariant (see appendix D). Thus far, U is 
still a classical field. Now Yukawa took the decisive step of treating U quantum 
mechanically, by looking for a (de Broglie-type) propagating wave solution of
(1.16), namely


$$U \propto \exp(i\mathbf{p}\cdot\mathbf{r}/\hbar - iEt/\hbar). \tag{1.17}$$

Inserting (1.17) into (1.16) one finds

$$\frac{E^2}{c^2\hbar^2} = \frac{\mathbf{p}^2}{\hbar^2} + \frac{1}{a^2} \tag{1.18}$$


or, taking the positive square root,

$$E = \left[c^2\mathbf{p}^2 + \frac{c^2\hbar^2}{a^2}\right]^{1/2}.$$


Comparing this with the standard $E$-$\mathbf{p}$ relation for a massive particle in special
relativity (appendix D), the fundamental conclusion is reached that the
quantum of the finite-range force field $U$ has a mass $m_U$ given by

$$m_U^2 c^4 = \frac{c^2\hbar^2}{a^2} \quad\text{or}\quad m_U = \frac{\hbar}{ac}. \tag{1.19}$$

This means that the range parameter in (1.13) is related to the mass of the
quantum $m_U$ by

$$a = \frac{\hbar}{m_U c}. \tag{1.20}$$

Inserting $a \approx 2$ fm gives $m_U \approx 100$ MeV, Yukawa’s famous prediction for the
mass of the nuclear force quantum.
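
The arithmetic behind this estimate is short enough to spell out. The following sketch assumes the rounded value $\hbar c \approx 197.327$ MeV fm and uses illustrative function names; the charged pion rest energy of about 139.57 MeV is inserted only for comparison with the quantum later identified with Yukawa's.

```python
# Sketch of the range-mass relation (1.20), with hbar and c restored.
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (rounded)


def quantum_mass_mev(range_fm):
    """Rest energy m_U c^2 in MeV for a force of range a (in fm)."""
    return HBAR_C_MEV_FM / range_fm


def force_range_fm(mass_mev):
    """Range a in fm for an exchanged quantum of rest energy m c^2 (in MeV)."""
    return HBAR_C_MEV_FM / mass_mev


print(quantum_mass_mev(2.0))    # a range of about 2 fm
print(force_range_fm(139.57))   # charged pion rest energy in MeV
```

A range of 2 fm corresponds to a rest energy just under 100 MeV, and conversely the pion mass corresponds to a range of roughly 1.4 fm, so the two numbers agree to within the order-one factors that such an estimate cannot fix.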

Next, Yukawa envisaged that the U-quantum would be emitted in the 
transition $n \to p$, via a process analogous to (1.12):


$$n \to p + U^- \tag{1.21}$$


where charge conservation determines the $U^-$ charge. Yet there is an obvious
difference between (1.21) and (1.12): (1.21) violates energy conservation since
$m_n < m_p + m_U$ if $m_U \approx 100$ MeV, so it cannot occur as a real emission process.
However, Yukawa noted that if (1.21) were combined with the inverse process


$$p + U^- \to n \tag{1.22}$$


then an n-p interaction could take place by the mechanism shown in figure 1.1(a);
namely, by the emission and subsequent absorption (that is, by
the exchange) of a $U^-$ quantum. He also included the corresponding $U^+$
exchange, where $U^+$ is the antiparticle of the $U^-$, as shown in figure 1.1(b).

An energy-violating transition such as (1.21) is known as a ‘virtual’ transi- 
tion in quantum mechanics. Such transitions are routinely present in quantum- 
mechanical time-dependent perturbation theory and can be understood in 
terms of an ‘energy—time uncertainty relation’ 


$$\Delta E\,\Delta t \geq \hbar/2. \tag{1.23}$$

FIGURE 1.1
Yukawa’s single-U exchange mechanism for the n-p interaction. (a) $U^-$ exchange.
(b) $U^+$ exchange.


The relation (1.23) may be interpreted as follows (we abridge the careful 
discussion in section 44 of Landau and Lifshitz (1977)). Imagine an ‘energy- 
measuring device’ set up to measure the energy of a quantum system. To do 
this, the device must interact with the quantum system for a certain length of 
time $\Delta t$. If the energy of a sequence of identically prepared quantum systems
is measured, only in the limit $\Delta t \to \infty$ will the same energy be obtained
each time. For finite $\Delta t$, the measured energies will necessarily fluctuate by
an amount $\Delta E$ as given by (1.23); in particular, the shorter the time over
which the energy measurement takes place, the larger the fluctuations in the 
measured energy. 

Wick (1938) applied (1.23) to Yukawa’s theory, and thereby shed new light 
on the relation (1.20). Suppose a device is set up capable of checking to see 
whether energy is, in fact, conserved while the $U^\pm$ crosses over in figure 1.1.
The crossing time $t$ must be at least $r/c$, where $r$ is the distance apart of the
nucleons. However, the device must be capable of operating on a time scale
smaller than $t$ (otherwise it will not be in a position to detect the $U^\pm$), but
it need not be very much less than this. Thus the energy uncertainty in the
reading by the device will be³


$$\Delta E \sim \frac{\hbar c}{r}. \tag{1.24}$$


As $r$ decreases, the uncertainty $\Delta E$ in the measured energy increases. If we


³ In this kind of argument, the ‘$\sim$’ sign should be understood as meaning that numerical
factors of order 1 (such as 2 or $\pi$) are not important. The coincidence between (1.25) and
(1.20) should not be taken too literally. Nevertheless, the physics of (1.25) is qualitatively
correct.
FIGURE 1.2
Scattering by a static point-like U-source. 


require $\Delta E = m_U c^2$, then

$$r \sim \frac{\hbar}{m_U c} \tag{1.25}$$

just as in (1.20). The ‘$r$’ in (1.25) is the extent of the separation allowed
between the n and the p, such that, in the time available, the $U^\pm$ can
‘borrow’ the necessary energy to come into existence and cross from one to 
the other. In this sense, r is the effective range of the associated force, as in 
(1.20). 

Despite the similarity to virtual intermediate states in ordinary quantum 
mechanics, the Yukawa-Wick process is nevertheless truly revolutionary because
it postulated an energy fluctuation $\Delta E$ great enough to create an as yet
unseen new particle, a new state of matter. 

We proceed to explore further aspects of Yukawa’s force mechanism. The 
reader should note that throughout the remainder of this book we shall generally
(unless otherwise stated) use units such that $\hbar = c = 1$: see Appendix B.


1.3.3 The one-quantum exchange amplitude 


Consider a particle, carrying ‘strong charge’ $g_N$, being scattered by an infinitely
massive (static) point-like U-source also of ‘charge’ $g_N$ as pictured in
figure 1.2. From the previous section, we know that the potential energy in
the Schrödinger equation for the scattered particle is precisely the $U(r)$ from
(1.13). Treating this to its lowest order in $U(r)$ (‘Born Approximation’; see
appendix H), the scattering amplitude is proportional to the Fourier transform
of $U(r)$:


$$f(\mathbf{q}) = \int e^{i\mathbf{q}\cdot\mathbf{r}}\, U(r)\, \mathrm{d}^3\mathbf{r} \tag{1.26}$$


where $\mathbf{q}$ is the momentum (or wavevector, since $\hbar = 1$) transfer $\mathbf{q} = \mathbf{k} - \mathbf{k}'$.
The transform is evaluated in appendix G equation (G.24), or in problem 1.1,
with the result

$$f(\mathbf{q}) = -\frac{g_N^2}{\mathbf{q}^2 + m_U^2}. \tag{1.27}$$


This implies that the amplitude (in this static case) for one-U exchange
is proportional to $-1/(\mathbf{q}^2 + m_U^2)$, where $\mathbf{q}$ is the momentum carried
by the U-quantum.
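
The result (1.27) can also be checked by brute force. For a spherically symmetric potential the angular integrations in (1.26) can be done by hand, leaving $f(\mathbf{q}) = (4\pi/q)\int_0^\infty r \sin(qr)\, U(r)\, \mathrm{d}r$ with $q = |\mathbf{q}|$. The sketch below evaluates this radial integral with Simpson's rule and compares it with the closed form. It is a numerical illustration only: the values $g_N^2 = 1$ and $m_U = 1/a = 0.5$ (in units with $\hbar = c = 1$), the cut-off of the integral, and the function names are assumptions.

```python
import math


def simpson(f, lo, hi, n=20000):
    """Composite Simpson's rule on [lo, hi] with n (even) intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def born_amplitude_numeric(q, g2=1.0, m=0.5):
    """f(q) = (4 pi / q) * integral of r sin(q r) U(r) dr, with U from (1.13), a = 1/m."""
    # r * U(r) = -(g2 / 4 pi) * exp(-m r), so the integrand is finite at r = 0.
    integrand = lambda r: math.sin(q * r) * (-(g2 / (4.0 * math.pi)) * math.exp(-m * r))
    # exp(-m r) is negligible beyond r = 60 / m, so the integral is cut off there.
    return (4.0 * math.pi / q) * simpson(integrand, 0.0, 60.0 / m)


def born_amplitude_exact(q, g2=1.0, m=0.5):
    """Closed form (1.27): -g_N^2 / (q^2 + m_U^2)."""
    return -g2 / (q * q + m * m)


for q in (0.3, 1.0, 2.0):
    print(q, born_amplitude_numeric(q), born_amplitude_exact(q))
```

The two columns agree closely for each momentum transfer, and setting `m` to smaller and smaller values shows the amplitude approaching the $-g_N^2/\mathbf{q}^2$ behaviour of a long-range potential.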

In this scattering by an infinitely massive source of potential, the energy 
of the scattered particle cannot change. In a real scattering process such as 
that in figure 1.1, both energy and momentum can be transferred by the U-quantum;
that is, $\mathbf{q}$ is replaced by the four-momentum $q = (q_0, \mathbf{q})$, where
$q_0 = k_0 - k_0'$. Then, as indicated in appendix G, the factor $-1/(\mathbf{q}^2 + m_U^2)$ is
replaced by $1/(q^2 - m_U^2)$ and the amplitude for figure 1.1 is, in this model,


$$\frac{g_N^2}{q^2 - m_U^2}. \tag{1.28}$$


It will be the main burden of chapters 5 and 6 to demonstrate just how 
this formula is arrived at, using the formalism of quantum field theory. In 
particular, we shall see in detail how the propagator $(q^2 - m_U^2)^{-1}$ arises. For
the present, we can already note (from appendix G) that such propagators
are, in fact, momentum-space Green functions.

In chapter 6 we shall also discuss other aspects of the physical meaning of 
the propagator, and we shall see how diagrams which we have begun to draw 
in a merely descriptive way become true ‘Feynman diagrams’, each diagram 
representing by a precise mathematical correspondence a specific expression 
for a quantum amplitude, as calculated in perturbation theory. The expansion 
parameter of this perturbation theory is the dimensionless number $g_N^2/4\pi$
appearing in the potential $U(r)$ (cf (1.13)). In terms of Feynman diagrams,
we shall learn in chapter 6 that one power of $g_N$ is to be associated with each
‘vertex’ at which a U-quantum is emitted or absorbed. Thus successive terms
in the perturbation expansion correspond to exchanges of more and more
quanta. Quantities such as $g_N$ are called ‘coupling strengths’, or ‘coupling
constants’. 

It is not too early to emphasize one very important point to the reader: true 
Feynman diagrams are representations of momentum-space amplitudes. They
are not representations of space-time processes: all space-time points are 
integrated over in arriving at the formula represented by a Feynman diagram. 
In particular, the two ‘intuitive’ diagrams of figure 1.1, which carry an implied 
‘time-ordering’ (with time increasing to the right), are both included in a single 
Feynman diagram with propagator (1.28), as we shall see in detail (for an 
analogous case) in section 7.1. 

We now indicate how these general ideas of Yukawa apply to the actual 
interactions of quarks and leptons. 
FIGURE 1.3
One photon exchange mechanism between charged leptons. 


1.3.4 Electromagnetic interactions 


From the foregoing viewpoint, electromagnetic interactions are essentially a 
special case of Yukawa’s picture, in which $g_N^2$ is replaced by the appropriate
electromagnetic charges, and $m_U \to m_\gamma = 0$ so that $a \to \infty$ and the potential
(1.13) returns to the Coulomb one, $-e^2/4\pi r$. A typical one-photon exchange
scattering process is shown in figure 1.3, for which the generic amplitude (1.28)
becomes

$$e^2/q^2. \tag{1.29}$$


Note that we have drawn the photon line ‘vertically’, consistent with the 
fact that both time-orderings of the type shown in figure 1.1 are included in 
(1.29). In the case of electromagnetic interactions, the coupling strength is $e$
and the expansion parameter of perturbation theory is $e^2/4\pi = \alpha \approx 1/137$
(see appendix C).

We can immediately use (1.29) to understand the famous $\sim \sin^{-4}(\theta/2)$ angular
variation of Rutherford scattering. Treating the target muon as infinitely
heavy (so as to simplify the kinematics), the electron scatters elastically so
that $q_0 = 0$ and $q^2 = -(\mathbf{k} - \mathbf{k}')^2$ where $\mathbf{k}$ and $\mathbf{k}'$ are the incident and final
electron momenta. So $q^2 = -2\mathbf{k}^2(1 - \cos\theta) = -4\mathbf{k}^2\sin^2(\theta/2)$ where we
have used the elastic scattering condition $\mathbf{k}^2 = \mathbf{k}'^2$. Inserting this into (1.29)
and remembering that the cross section is proportional to the square of the
amplitude (appendix H) we obtain the distribution $\sin^{-4}(\theta/2)$. Thus, such a
distribution is a clear signature that the scattering is proceeding via the exchange
of a massless quantum.
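
The link between the massless propagator and the Rutherford form can be seen numerically in a few lines. In the sketch below the squared amplitude is multiplied by $\sin^4(\theta/2)$: for a massless exchanged quantum the product is independent of angle, whereas a non-zero mass in the propagator (here an arbitrary illustrative value) spoils this. The momentum $|\mathbf{k}| = 1$, the mass value, and the function names are assumptions made for the illustration, in units with $\hbar = c = 1$.

```python
import math

E2 = 4.0 * math.pi / 137.0  # e^2 = 4 pi alpha, with alpha taken as 1/137


def q_squared(k, theta):
    """Four-momentum transfer squared for elastic scattering off a static target."""
    return -4.0 * k * k * math.sin(theta / 2.0) ** 2


def amplitude_squared(k, theta, mass=0.0):
    """Square of e^2 / (q^2 - m^2); mass = 0 reproduces (1.29)."""
    return (E2 / (q_squared(k, theta) - mass * mass)) ** 2


def scaled(k, theta, mass=0.0):
    """|amplitude|^2 * sin^4(theta/2): constant in theta only if mass = 0."""
    return amplitude_squared(k, theta, mass) * math.sin(theta / 2.0) ** 4


for theta in (0.2, 1.0, 2.5):
    print(theta, scaled(1.0, theta), scaled(1.0, theta, mass=0.5))
```

The first column of results is the same at every angle, equal to $e^4/16\mathbf{k}^4$; the second falls away at small angles, where $|q^2|$ becomes comparable with the squared mass.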

Unfortunately, the detailed application of these ideas to the electromagnetic
interactions of quarks and leptons is complicated, because the electromagnetic
potentials are the components of a 4-vector (see chapter 2), rather
than a scalar as in (1.29), and the quarks and leptons all have spin-1/2, necessitating
the use of the Dirac equation (chapter 3). Nevertheless, (1.29) remains
the essential ‘core’ of electromagnetic amplitudes.

FIGURE 1.4
Yukawa’s U-exchange mechanism for neutron $\beta$-decay.

As far as the electromagnetic field is concerned, its 4-vector nature is ac- 
tually a fundamental feature, having to do with a symmetry called gauge 
invariance, or (better) local phase invariance. As we shall see in chapters 2 
and 7, the form of the electromagnetic interaction is very strongly constrained 
by this symmetry. In fact, turning the argument around, one can (almost) 
understand the necessity of electromagnetic interactions as being due to the re- 
quirement of gauge invariance. Most significantly, we shall see in section 7.3.1 
how the masslessness of the photon is also related to gauge invariance. 

In chapter 8 a number of elementary electromagnetic processes will be fully 
analysed, and in chapter 11 we shall discuss higher-order corrections in QED. 


1.3.5 Weak interactions 


In a bold extension of his ‘strong force’ idea, Yukawa applied his theory
to neutron $\beta$-decay as well, via the hypothesized process shown in
figure 1.4 (here and in figure 1.5 we revert to the more intuitive ‘time-ordered’
picture; the reader may supply the diagrams corresponding to the other
time-ordering). As indicated on the diagram, Yukawa assigned the strong charge
$g_N$ at the n-p end, and a different ‘weak’ charge $g'$ at the lepton end. Thus
the same quantum mediated both strong and weak transitions, and he had
an embryonic ‘unified theory’ of strong and weak processes! If we take $U^-$
to be the $\pi^-$, Yukawa’s mechanism predicts the existence of the weak decay
$\pi^- \to e^- + \bar{\nu}_e$.

This decay does indeed occur, though at a much smaller rate than the main
mode which is $\pi^- \to \mu^- + \bar{\nu}_\mu$. But, apart from the now familiar problem with
the compositeness of the nucleons and pions, this kind of unification is not
chosen by Nature. Not unreasonably in 1935, Yukawa was assuming that the
range $\sim m_U^{-1}$ of the strong force in n-p scattering (figure 1.1) was the same
as that of the weak force in neutron $\beta$-decay (figure 1.4); after all, the latter
(and more especially positron emission) was viewed as a nuclear process. But
this is now known not to be the case: in fact, the range of the weak force
is much smaller than nuclear dimensions or, equivalently (see (1.19)), the
masses of the mediating quanta are much greater than that of the pion.

FIGURE 1.5
(a) $\beta$-decay and (b) $e^+$ emission at the quark level, mediated by $W^\pm$.

$\beta$-decay is now understood as occurring at the quark level via the $W^-$-exchange
process shown in figure 1.5(a). Similarly, positron emission proceeds
via figure 1.5(b). Other ‘charged current’ processes all involve $W^\pm$-exchange,
generalized appropriately to include flavour mixing effects (see volume 2).
‘Neutral current’ processes involve exchange of the $Z^0$-quantum; an example
is given in figure 1.6. The quanta $W^\pm$, $Z^0$ therefore mediate these weak interactions
as does the photon for the electromagnetic one. Like the photon, the
W and Z are the quanta of 4-vector fields⁴ and have spin 1, but unlike the
photon, the masses of the W and Z are far from zero: in fact $M_W \approx 80$ GeV
and $M_Z \approx 91$ GeV. So the range of the force is $\sim M_W^{-1} \approx 2.5 \times 10^{-18}$ m, much
less than typical nuclear dimensions ($\sim$ few $\times 10^{-15}$ m). This, indeed, is one
way of understanding why the weak interactions appear to be so weak: this
range is so tiny that only a small part of the hadronic volume is affected.

Thus Nature has not chosen to unify the strong and weak forces via a 
common mediating quantum. Instead, it has turned out that the weak and 
strong forces (see section 1.3.6) are both described by gauge theories, generalizations of
electromagnetism, as will be discussed in volume 2. This raises the possibility
of ‘unifying’ all three forces.


⁴ This is dictated by the phenomenology of weak interactions; see chapter 20 in volume
2.

FIGURE 1.6
$Z^0$-exchange process.


Some initial idea of how this works in the ‘electroweak’ case may be gained 
by considering the amplitude for figure 1.5(a) in the low $-q^2$ limit. In a
simplified version analogous to (1.29) which ignores the spin of the W and of
the leptons, this amplitude is


$$g^2/(q^2 - M_W^2) \tag{1.30}$$


where $g$ is a ‘weak charge’ associated with W-emission and absorption. In
actual $\beta$-decay, the square of the 4-momentum transfer $q^2$ is tiny compared to
$M_W^2$, so that (1.30) becomes independent of $q^2$ and takes the constant value
$-g^2/M_W^2$. This corresponds, in configuration space, to a point-like interaction
(the Fourier transform of a delta function is a constant). Just such a point-like
interaction, shown in figure 1.7, had been postulated by Fermi (1934a, b)
in the first theory of $\beta$-decay: it is a ‘four-fermion’ interaction with strength
$G_F$. The value of $G_F$ can be determined from measured $\beta$-decay rates. The
dimensions of $G_F$ turn out to be energy $\times$ volume, so that $G_F/(\hbar c)^3$ has
dimension (energy$^{-2}$). In our units $\hbar = c = 1$, the numerical value of $G_F$ is


$$G_F \sim (300\ \text{GeV})^{-2}. \tag{1.31}$$

If we identify this constant with $g^2/M_W^2$, we obtain
$$g^2 \approx M_W^2/(300\ \text{GeV})^2 \approx 0.07 \tag{1.32}$$


a value quite similar to that of the electromagnetic charge $e^2$ as determined
from $e^2 = 4\pi\alpha \approx 0.09$. Though this is qualitatively correct, we shall see
in volume 2 that the actual relation, in the electroweak theory, between the
weak and electromagnetic coupling strengths is somewhat more complicated
than the simple equality ‘$g = e$’. (Note that a corresponding connection with
Fermi’s theory was also made by Yukawa!) 
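
The comparison of coupling strengths is a matter of simple arithmetic, sketched here with the rounded inputs used in the text ($M_W \approx 80$ GeV, $G_F \sim (300\ \text{GeV})^{-2}$, $\alpha \approx 1/137$). The constant and function names are illustrative, and the numbers should be read only at the level of accuracy implied by the ‘$\sim$’ signs above.

```python
import math

ALPHA = 1.0 / 137.0      # fine structure constant (rounded)
M_W_GEV = 80.0           # W mass in GeV (rounded)
FERMI_SCALE_GEV = 300.0  # G_F ~ (300 GeV)^-2, as in (1.31)


def weak_coupling_squared(m_w=M_W_GEV, scale=FERMI_SCALE_GEV):
    """g^2 from identifying G_F with g^2 / M_W^2."""
    return (m_w / scale) ** 2


def em_coupling_squared(alpha=ALPHA):
    """e^2 = 4 pi alpha in units with hbar = c = 1."""
    return 4.0 * math.pi * alpha


g2 = weak_coupling_squared()
e2 = em_coupling_squared()
print(g2, e2, g2 / e2)
```

With these inputs $g^2$ comes out at about 0.07 and $e^2$ at about 0.09, so the two agree to within a factor of order one, which is all that the simplified amplitudes (1.29) and (1.30) can be expected to show.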
FIGURE 1.7
Point-like four-fermion interaction. 


We can now understand the ‘weakness’ of the weak interactions from another
viewpoint. For $q^2 \ll M_W^2$, the ratio of the weak amplitude (1.30) to the
electromagnetic amplitude (1.29) is of order $q^2/M_W^2$, given that $e \sim g$.
Thus despite having an intrinsic strength similar to that of electromagnetism,
weak interactions will appear very weak at low energies such that $q^2 \ll M_W^2$.
At energies approaching $M_W$, however, weak interactions will grow in importance
relative to electromagnetic ones and, when $q^2 \gg M_W^2$, weak and
electromagnetic interactions will contribute roughly equally.
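
The energy dependence described here follows directly from the two simplified amplitudes, whose ratio is $(g^2/e^2)\, q^2/(q^2 - M_W^2)$. The sketch below evaluates its magnitude for a few space-like values of $q^2$ (negative, so that the pole at $q^2 = M_W^2$ is never met). Setting $g = e$ and $M_W = 80$ GeV, and the function name, are assumptions made for the illustration.

```python
M_W_GEV = 80.0  # W mass in GeV (rounded)


def weak_to_em_ratio(q2, m_w=M_W_GEV, g2_over_e2=1.0):
    """|weak amplitude (1.30) / electromagnetic amplitude (1.29)| at four-momentum transfer squared q2 (GeV^2)."""
    return abs(g2_over_e2 * q2 / (q2 - m_w * m_w))


# q2 in GeV^2: far below, comparable with, and far above M_W^2 in magnitude.
for q2 in (-1.0, -M_W_GEV ** 2, -100.0 * M_W_GEV ** 2):
    print(q2, weak_to_em_ratio(q2))
```

At $|q^2| = 1\ \text{GeV}^2$ the weak amplitude is suppressed by a factor of several thousand, at $|q^2| = M_W^2$ the ratio is one half, and well above $M_W^2$ it approaches unity.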

‘Similar’ coupling strengths are still not ‘unified’, however. True unifi- 
cation only occurs after a more subtle effect has been included, which goes 
beyond the one-quantum exchange mechanism. This is the variation or ‘run- 
ning’ of the coupling strengths as a function of energy (or distance), caused 
by higher-order processes in perturbation theory. This will be discussed more 
fully in chapter 11 for QED, and in volume 2 for the other gauge couplings. 
It turns out that the possibility of unification depends crucially on an important
difference between the weak interaction quanta $W^\pm$ (to take the present
example) and the photons of QED, which has not been apparent in the simple
$\beta$-decay processes considered so far. The W’s are themselves ‘weakly charged’,
acting as both carriers and sources of the weak force field, and they therefore 
interact directly amongst themselves even in the absence of other matter. 
By contrast, photons are electromagnetically neutral and have no direct self- 
interactions. In theories where the gauge quanta self-interact, the coupling 
strength decreases as the energy increases, while for QED it increases. It is 
this differing ‘evolution’ that tends to bring the strengths together, ultimately. 


Even granted similar coupling strengths and the fact that both are 4-vector 
fields, the idea of any electroweak unification appears to founder immediately 
on the markedly different ranges of the two forces or, equivalently, of the 
masses of the mediating quanta ($m_\gamma = 0$, $M_W \approx 80$ GeV!). This difficulty
becomes even more pointed when we recall that, as previously mentioned, 
the masslessness of the photon is related to gauge invariance in electrody- 
namics: how then can there be any similar kind of gauge symmetry for weak 
interactions, given the distinctly non-zero masses of the mediating quanta? 
Nevertheless, in one of the great triumphs of 20th century theoretical physics, 
it is possible to see the two theories as essentially similar gauge theories, the 
gauge symmetry being ‘spontaneously broken’ in the case of weak interactions.
This is a central feature of the GSW electroweak theory. An indication
of how gauge quanta might acquire mass will be given in section 11.4 but a 
fuller explanation, with application to the electroweak theory, is reserved for 
volume 2. We will have a few more words to say about it in section 1.4.1. 


1.3.6 Strong interactions 


We turn to the contemporary version of Yukawa’s theory of strong interac- 
tions, now viewed as occurring between quarks rather than nucleons. Evidence 
that the strong interquark force is in some way similar to QED comes from 
nucleon-nucleon (or nucleon-antinucleon) collisions. Regarding the nucleons 
as composites of point-like quarks, we would expect to see prominent events at 
large scattering angles corresponding to ‘hard’ q-q collisions (recall Ruther- 
ford’s discovery of the nucleus). Now the result of such a hard collision would 
normally be to scatter the quarks to wide angles, ‘breaking up’ the nucleons 
in the process. However, quarks (except for the t quark) are not observed 
as free particles. Instead, what appears to happen is that, as the two quarks 
separate from each other, their mutual potential energy increases — so much so 
that, at a certain stage in the evolution of the scattering process, the energy 
stored in the potential converts into a new $q\bar{q}$ pair. This process continues, 
with in general many pairs being produced as the original and subsequent 
pairs pull apart. By a mechanism which is still not quantitatively understood 
in detail, the produced quarks and antiquarks (and the original quarks in the 
nucleons) bind themselves into hadrons within an interaction volume of order 
1 fm, so that no free quarks are finally observed, consistent with ‘confine- 
ment’. Very strikingly, these hadrons emerge in quite well-collimated ‘jets’, 
suggesting rather vividly their ancestry in the original separating qq pair. 
Suppose, then, that we plot the angular distribution of such ‘two jet events’: 
it should tell us about the dynamics of the original interaction at the quark 
level. 


Figure 1.8 shows such an angular distribution from proton-antiproton scattering, 
so that the fundamental interaction in this case is the elastic scattering 
process $\bar{q}q \to \bar{q}q$. Here $\theta$ is the scattering angle in the 
$\bar{q}q$ centre of mass system (CMS). Amazingly, the $\theta$-distribution 
follows almost exactly the ‘Rutherford’ form $\sin^{-4}(\theta/2)$. 

We saw how, in the Coulomb case, this distribution could be understood 
as arising from the propagator factor $1/\mathbf{q}^2$, which itself comes from the 1/r 
potential associated with the massless quantum involved, namely the photon. 
In the present case, the same $1/\mathbf{q}^2$ factor is responsible: here, in the $\bar{q}q$ 
centre of mass system, $\mathbf{k}$ and $-\mathbf{k}$ are the momenta of the initial q and 
$\bar{q}$, while $\mathbf{k}'$ and $-\mathbf{k}'$ are the corresponding final momenta. Once again, 
for elastic scattering there is no energy transfer, and 
$q^2 = -\mathbf{q}^2 = -(\mathbf{k} - \mathbf{k}')^2 = -4\mathbf{k}^2 \sin^2(\theta/2)$ as 
before, leading to the $\sin^{-4}(\theta/2)$ form on squaring $1/\mathbf{q}^2$. Once again, such a 
distribution is a clear signal that a massless quantum is being exchanged: in 
this case, the gluon. 

FIGURE 1.8 

Angular distribution of two-jet events in $p\bar{p}$ collisions (Arnison et al. 1985) 
as a function of $\cos\theta$, where $\theta$ is the CMS scattering angle. The broken curve 
is the prediction of QCD, obtained in the lowest order of perturbation theory 
(one-gluon exchange); it is virtually indistinguishable from the Rutherford 
(one-photon exchange) shape $\sin^{-4}(\theta/2)$. The full curve includes higher order 
QCD corrections. 
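
As a rough numerical illustration of the Rutherford form (a sketch only, not the analysis of Arnison et al.), the following Python snippet evaluates the unnormalized shape $\sin^{-4}(\theta/2)$ as a function of $\cos\theta$, using the identity $\sin^2(\theta/2) = (1 - \cos\theta)/2$. The function name and the sample points are our own choices.

```python
def rutherford_shape(cos_theta):
    """Unnormalized Rutherford shape sin^-4(theta/2), written in terms of cos(theta).

    Uses sin^2(theta/2) = (1 - cos(theta)) / 2, so the shape is 4 / (1 - cos(theta))^2.
    """
    if not -1.0 <= cos_theta < 1.0:
        raise ValueError("cos_theta must lie in [-1, 1); the shape diverges at cos_theta = 1")
    return 4.0 / (1.0 - cos_theta) ** 2


# Illustrative sample points (hypothetical, not the experimental binning)
for c in (0.0, 0.4, 0.8):
    print(f"cos(theta) = {c:.1f}  ->  shape = {rutherford_shape(c):.2f}")
```

The steep rise towards $\cos\theta = 1$ (forward scattering) is the signature of the $1/\mathbf{q}^2$ propagator of a massless exchanged quantum.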

It might then seem to follow that, as in the case of QED, the QCD inter- 
action has infinite range. But this cannot be right: the strong forces do not 
extend beyond the size of a typical hadron, which is roughly 1 fm. Indeed, the 
QCD force is mediated by the massless spin-1 gluon, and QCD is also a gauge 
theory; but the form of the QCD interaction, though somewhat analogous to 
QED, is more complicated, and the long range behaviour of the force is very 
different. 

As we have seen, each quark comes in three colours, and the QCD force 
is sensitive to this colour label: the gluons effectively ‘carry colour’ back and 
forth between the quarks, as shown in the one-gluon exchange process of fig- 
ure 1.9. Because the gluons carry colour, they can interact with themselves, 
like the W’s and Z’s of the GSW theory. As in that case, these gluonic 
self-interactions cause the QCD interaction strength to decrease at short 
distances (or high energies), ultimately tending to zero, the property known as 
asymptotic freedom. So in ‘hard’ collisions occurring at short inter-particle 
distances, the one-gluon exchange mechanism gives a good first approximation 
to the data. But the force grows much stronger as the quarks separate 
from each other, and perturbation theory is no longer a reliable guide. In 
fact, it seems that a new, non-perturbative, effect occurs, namely confinement. 
Once again, a gauge theory, with formal similarity to QED, has very 
different physical consequences. 

FIGURE 1.9 

Strong scattering via gluon exchange. At the top vertex, the ‘flow’ of colour is 
b (quark) $\to$ r (quark) + $\bar{r}b$ (gluon); at the lower vertex the flow is $\bar{r}b$ (gluon) 
+ r (quark) $\to$ b (quark). 

A phenomenological $q\bar{q}$ (or qq) potential which is often used in quark 
models has the form 

$V(r) = -\frac{a}{r} + br$    (1.33) 

where the first term, which dominates at small r, arises from a single-gluon 
exchange so that $a \sim g_s^2$, where the strong (QCD) charge is $g_s$. The second 
term models confinement at larger values of r. Such a potential provides 
quite a good understanding of the gross structure of the $c\bar{c}$ and $b\bar{b}$ systems 
(see problem 1.5). A typical value for b is 0.85 GeV fm$^{-1}$ (which corresponds 
to a constant force of about 14 tonnes!). Thus at $r \sim 2$ fm, there is enough 
energy stored to produce a pair of the lighter quarks. This ‘linear’ part of 
the potential cannot be obtained by considering the exchange of one, or even 
a finite number of, gluons: in other words, not within an approach based on 
perturbation theory. 
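
The numbers quoted for the linear term can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the value b = 0.85 GeV fm$^{-1}$ is taken from the text, while the Coulombic coefficient is an illustrative choice ($a = 0.5\,\hbar c$, borrowing the value 0.5 from problem 1.5), and all names are our own.

```python
GEV_IN_JOULES = 1.602176634e-10  # 1 GeV expressed in joules
FM_IN_METRES = 1.0e-15           # 1 fm expressed in metres
G_EARTH = 9.81                   # m/s^2, used only to express the force as a weight
HBAR_C_GEV_FM = 0.1973           # hbar*c in GeV fm

B_GEV_PER_FM = 0.85              # string tension quoted in the text
A_GEV_FM = 0.5 * HBAR_C_GEV_FM   # assumed Coulombic coefficient (illustrative only)


def potential_gev(r_fm, a=A_GEV_FM, b=B_GEV_PER_FM):
    """Phenomenological potential V(r) = -a/r + b*r in GeV, with r in fm."""
    return -a / r_fm + b * r_fm


def string_tension_tonnes(b=B_GEV_PER_FM):
    """Constant force b converted to newtons, then to an equivalent weight in tonnes."""
    force_newtons = b * GEV_IN_JOULES / FM_IN_METRES
    return force_newtons / G_EARTH / 1000.0


print(f"constant force ~ {string_tension_tonnes():.1f} tonnes-weight")
print(f"linear term at r = 2 fm: {B_GEV_PER_FM * 2.0:.2f} GeV")
print(f"V(2 fm) with the assumed a: {potential_gev(2.0):.2f} GeV")
```

With these inputs the force comes out close to the 14 tonnes quoted above, and the linear term alone stores about 1.7 GeV at r = 2 fm.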

It is interesting to note that the linear part of the potential may be regarded 
as the solution of the one-dimensional form of $\nabla^2 V = 0$, namely 
$d^2V/dr^2 = 0$; this is in contrast to the Coulombic 1/r part, which is a solution 
(except at r = 0) to the full three-dimensional Laplace equation. This 
suggests that the colour field lines connecting two colour charges spread out 
into all of space when the charges are close to each other, but are somehow 
‘squeezed’ into an elongated one-dimensional ‘string’ as the distance between 
the charges becomes greater than about 1 fm. In the second volume, we shall 
see that numerical simulations of QCD, in which the space-time continuum is 
represented as a discrete lattice of points, indicate that such a linear potential 
does arise when QCD is treated non-perturbatively. It remains a challenge 
for theory to demonstrate that confinement follows from QCD. 
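
The remark about the Laplace equation can be made explicit with a short check. This is a sketch using the standard radial form of the Laplacian for a spherically symmetric function; a and b denote the coefficients of the Coulombic and linear terms.

```latex
% Three dimensions, spherically symmetric V(r), r \neq 0:
\nabla^2 V = \frac{1}{r^2}\frac{d}{dr}\!\left(r^2\frac{dV}{dr}\right), \qquad
V = -\frac{a}{r} \;\Rightarrow\; r^2\frac{dV}{dr} = a
\;\Rightarrow\; \nabla^2 V = 0 .

% One dimension:
\frac{d^2 V}{dr^2} = 0 \;\Rightarrow\; V = br + \text{constant} .
```

So the 1/r term is the field of a point source spreading into three dimensions, while the linear term is what the same equation gives when the field is confined to one dimension.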

It is believed that gluons too are confined by QCD, so that — like quarks 
— they are not seen as isolated free particles. But they too ‘hadronize’ after 
being produced in a primitive short-distance collision process, as happens in 
the case of q’s and $\bar{q}$’s. Such ‘gluon jets’ provide indirect evidence for the 
existence and properties of gluons, as we shall see in volume 2. 

This is an appropriate moment at which to emphasize what appears to 
be a crucial distinction between the three ‘charges’ (electromagnetic, weak 
and strong) on the one hand, and the various flavour quantum numbers on 
the other. The former have a dynamical significance, whereas the latter do 
not. In the case of electric charge, for example, this means simply that a 
particle carrying this property responds in a definite way to the presence of 
an electromagnetic field and itself creates such a field. No such force fields are 
known for any of the flavour numbers, which are (at present) purely empirical 
classification devices, without dynamical significance. 


1.3.7 The gauge bosons of the Standard Model 


We can now gather together the mediators of the SM forces. They are all gauge 
bosons, meaning that they are the quanta of various 4-vector gauge fields. For 
example, the photon is the quantum of the electromagnetic (Maxwell) 4-vector 
potential $A^\mu(x)$ (see chapter 2 and section 6.3.1), which is the simplest gauge 
field. The gluon is the quantum of the QCD potential $A^\mu_a(x)$, where the colour 
index a runs from 1 to 8. The reason there are 8 of them may be guessed 
from figure 1.9: each gluon can be thought of as carrying one colour-anticolour 
combination, such as $\bar{r}b$, $\bar{b}g$, and so on; the symmetric combination 
$\bar{r}r + \bar{b}b + \bar{g}g$ is totally colourless and is discarded (see section 12.2 in 
volume 2). In the GSW electroweak theory, there are four gauge fields, $W_i^\mu(x)$ 
where i runs from 1 to 3, and $B^\mu(x)$ which is analogous to $A^\mu(x)$. One linear 
combination of $W_3^\mu(x)$ and $B^\mu(x)$ is associated with the photon field $A^\mu(x)$; the 
orthogonal combination is associated with the $Z^\mu(x)$ field whose quantum is the 
$Z^0$. The charged carriers $W^\pm$ are associated with the $W_1^\mu(x)$ and $W_2^\mu(x)$ 
components of the $W_i^\mu(x)$ field. 
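
The counting of gluons described above can be spelled out in a few lines of Python. This is only a bookkeeping sketch of the heuristic argument (nine colour-anticolour labels minus the one colourless combination), not a construction of the actual gluon states; the variable names are our own.

```python
from itertools import product

colours = ("r", "g", "b")

# Each pair (anticolour, colour) labels one combination, e.g. ("r", "b") stands for r-bar b
pairs = list(product(colours, repeat=2))
n_combinations = len(pairs)        # 3 x 3 colour-anticolour labels

# The symmetric combination r-bar r + b-bar b + g-bar g is colourless and is discarded
n_colourless = 1
n_gluons = n_combinations - n_colourless

print(f"{n_combinations} combinations - {n_colourless} colourless = {n_gluons} gluons")
```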

We shall assume that the mass of the photon and of the gluon is exactly 
zero. This can never be established experimentally, of course: the current 
experimental limit on the photon mass is that it is less than $1 \times 10^{-18}$ eV 
(Nakamura et al. 2010). All gauge fields have spin 1 (in units of $\hbar$). Ordinarily, a 
spin-1 particle would be expected to have three polarization states, according 
to quantum mechanics. However it is a general result that in the massless 
case the quanta have only two polarization states, both transverse to the 
direction of motion; the longitudinally polarized state is absent (this property, 
familiar for the corresponding classical fields which are purely transverse, will 
be discussed in section 7.3.1). By contrast, all three polarization states are 
present for the massive gauge bosons. 

TABLE 1.3 
Properties of SM gauge bosons. 

Particle            Polarization states   Mass                     Width/Lifetime 
$\gamma$ (photon)   2                     0 (theoretical)          stable 
g (gluon)           2                     0 (theoretical)          stable 
$W^\pm$             3                     80.399 ± 0.023 GeV       $\Gamma_W$ = 2.085 ± 0.042 GeV 
$Z^0$               3                     91.1876 ± 0.0021 GeV     $\Gamma_Z$ = 2.4952 ± 0.0023 GeV 

The photon and the gluon are stable particles. The $W^\pm$ and $Z^0$ particles 
decay with total widths of the order of 2 GeV (lifetimes $\sim 0.3 \times 10^{-24}$ s). 
Although this is significantly shorter than typical strong interaction decay 
lifetimes, these are of course weak decays, the rate being enhanced by the 
large energy release. 
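
The connection between width and lifetime used here is $\tau = \hbar/\Gamma$. A minimal Python sketch of the conversion follows; the widths are the central values from table 1.3, the value of $\hbar$ in GeV s is a standard constant, and the function name is our own.

```python
HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV s


def lifetime_seconds(width_gev):
    """Mean lifetime tau = hbar / Gamma for a total width given in GeV."""
    return HBAR_GEV_S / width_gev


# Central values of the total widths from table 1.3
for name, width in (("W", 2.085), ("Z", 2.4952)):
    print(f"{name}: Gamma = {width} GeV  ->  tau ~ {lifetime_seconds(width):.2e} s")
```

Both lifetimes come out at roughly $0.3 \times 10^{-24}$ s, as stated.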

Table 1.3 lists the properties of the SM gauge bosons; the masses and 
widths are taken from Nakamura et al. (2010). 


1.4 Renormalization and the Higgs sector of the Standard Model 


1.4.1 Renormalization 


So far we have been discussing processes in which only one particle is ex- 
changed. These will generally be the terms of lowest order in a perturbative 
expansion in powers of the coupling strength. But we must clearly go beyond 
lowest order, and include the effects of multi-particle exchanges. We shall 
explain how to do this in chapter 10, for a simple scalar field theory. Such 
multi-particle exchange amplitudes are given by integrals over the momenta 
of the exchanged particles, constrained only by four-momentum conservation 
(no integral arises in the case of the exchange of a single particle, because its 
four-momentum is fixed in terms of the momenta of the scattering particles, 
as in section 1.2.3). It turns out that the integrals nearly always diverge as the 
momenta of the exchanged particles tend to infinity. Nevertheless, as we shall 
explain in chapter 10, this theory can be reformulated, by a process called 
renormalization, in such a way that all multi-particle (higher-order) processes 
become finite and calculable — a quite remarkable fact, and one that is of 
course an absolutely crucial requirement in the case of the Standard Model 
interactions, where the relevant data are precise enough to test the accuracy 
of the theory well beyond lowest order, particularly in the case of QED (see 
chapter 11). The price to be paid for this taming of the divergences is just 
that the basic parameters of the theory, such as masses and coupling con- 
stants, have to be treated as parameters to be determined by comparison to 
the data, and cannot themselves be calculated. 

But some theories cannot be reformulated in this way — they are non- 
renormalizable. A simple test for whether a theory is renormalizable or not 
will be discussed in section 11.8: if the coupling constant has dimensions of 
a mass to an inverse power, the theory is non-renormalizable. An example of 
such a theory is the original four-Fermi theory of weak interactions, where the 
coupling constant $G_F$ has the dimensions of an inverse square mass (or energy) 
as we saw in (1.31). We will look at this theory again in section 11.8, but the 
essential point for our purpose now is that the dimensionful coupling constant 
introduces an energy scale into the problem, namely $G_F^{-1/2} \sim 300$ GeV. 
It seems reasonable to infer that a more relevant measure of the interaction 
strength will be given by the dimensionless number $E G_F^{1/2}$, where E is a 
characteristic physical energy scale of any weak process under consideration 
(for example, the energy in the centre of momentum frame in a two-particle 
scattering process, at least at energies much greater than the particle masses). 
Then, for energies very much less than $G_F^{-1/2}$, the effective strength will be 
very weak, and the lowest order term in perturbation theory will work fine; 
this is how the Fermi theory was used, for many years. But as the energy 
increases, what happens is that more and more parameters have to be taken 
from experiment, in order to control the divergences; as the energy approaches 
$G_F^{-1/2}$ the theory becomes totally non-predictive and breaks down. Thus 
renormalizability is regarded as highly desirable in a theory. 
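
To put numbers on this argument, the sketch below evaluates the scale $G_F^{-1/2}$ and the dimensionless strength $E G_F^{1/2}$ at a few energies. The numerical value of $G_F$ is an assumption on our part (the standard value of about $1.166 \times 10^{-5}$ GeV$^{-2}$, which is not quoted in this section), and the sample energies and function names are purely illustrative.

```python
import math

G_F = 1.1664e-5  # Fermi constant in GeV^-2 (assumed standard value, not quoted in the text)


def fermi_scale_gev(g_f=G_F):
    """Energy scale G_F^(-1/2) introduced by the dimensionful coupling."""
    return 1.0 / math.sqrt(g_f)


def effective_strength(energy_gev, g_f=G_F):
    """Dimensionless measure E * G_F^(1/2) of the interaction strength at energy E."""
    return energy_gev * math.sqrt(g_f)


print(f"G_F^(-1/2) ~ {fermi_scale_gev():.0f} GeV")

# Illustrative energies in GeV: roughly nuclear beta decay, 1 GeV, 100 GeV
for energy in (0.001, 1.0, 100.0):
    print(f"E = {energy:g} GeV  ->  E * G_F^(1/2) = {effective_strength(energy):.2e}")
```

The strength is tiny at MeV energies, where the lowest-order Fermi theory works well, and grows to order one as E approaches the 300 GeV scale.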

One might hope to come up with a renormalizable theory of weak interac- 
tions by replacing the four-fermion interaction by a Yukawa-like mechanism, 
with exchange of a quantum of mass M and dimensionless coupling y, say. 
Then just as in (1.32) we would identify $G_F \sim y^2/M^2$ at low energies. However, 
as we have seen, phenomenology implies that the massive exchanged 
quantum must have spin 1. Unfortunately, this type of straightforward mas- 
sive spin-1 theory is not renormalizable either, as we shall discuss in chapter 
22 (in volume 2). The trouble can be traced directly to the existence of the 
longitudinal polarization state which, as noted previously, is present for a 
massive spin-1 particle. If the exchanged spin-1 quantum were massless, as 
in QED, it would lack that third polarization state, and the theory would be 
renormalizable. But weak interaction facts dictate both non-zero mass and 
spin-1. 

In the case of QED, there is a symmetry principle behind both the zero 
mass of the photon and the absence of the longitudinal polarization state: 
this symmetry is gauge invariance as we shall explain in section 7.3.1. It 
turns out that this symmetry is vital in rendering QED renormalizable. It is 
natural then to ask whether in the case of QED, a situation ever arises where 
the photon acquires mass, while retaining fully gauge-invariant interactions — 
and hence renormalizability (we would hope). If so, we would then have an 
analogue of what is needed for a renormalizable theory of weak interactions. 
The answer is that this can indeed happen, but it requires some extra dynamics 
to do it. Nature has actually provided us with a working model of what we 
want, in the phenomenon of superconductivity. There, the Meissner effect can 
be interpreted as implying that the photons propagating in a thin surface layer 
of the material have non-zero mass (see section 19.2). The dynamics behind 
this is subtle, and required many years of theoretical efforts before it was 
finally understood by Bardeen, Cooper and Schrieffer (1957). In simple terms, 
the mechanism is a two-step process. First, lattice interactions cause electrons 
to bind into pairs; then these pairs undergo Bose-Einstein condensation. This 
‘condensate’ is the BCS superconducting ground state. The essential point is 
that although the electromagnetic interactions are fully gauge invariant, the 
ground state is not. When a symmetry is broken by the ground state, it is 
said to be ‘spontaneously’ broken. We shall provide an introduction to the 
BCS ground state in chapter 17 of volume 2. 


The BCS theory is an example of spontaneous symmetry breaking oc- 
curring dynamically (through the particular lattice interactions). Many of 
the physically important phenomena can, however, be very satisfactorily de- 
scribed in terms of an effective theory, which treats only the electrodynamics 
of the condensate. Such a description was proposed by Ginzburg and Landau 
(1950), well before the BCS paper, in fact. 


How can this be applied in particle physics? Recall the idea, mentioned 
in section 1.3.1, that the analogue of the many-body ground state is the qft 
vacuum (Nambu 1961). In the Standard Model, the weak interactions are 
indeed described by a gauge-invariant theory, and the assumption is made 
that the vacuum breaks the gauge symmetry. The simplest way this idea 
can be implemented is along the lines of the Ginzburg-Landau theory, as 
suggested by Weinberg (1967) and by Salam (1968), and their proposal is em- 
bodied in the Glashow-Salam-Weinberg electroweak theory, which is part of 
the SM. It requires the introduction of four new spin-0 fields, which are called 
Higgs fields (Higgs 1964, Englert and Brout 1964, Guralnik et al. 1964), 
and which we may think of as playing the role of the BCS condensate (but 
not for electromagnetism, of course). The combined theory of quarks, lep- 
tons, electroweak gauge fields, and Higgs fields is gauge invariant, but one of 
the Higgs fields is supposed to have a non-zero average value in the physical 
vacuum, which breaks the gauge symmetry. The other three Higgs fields effectively 
become the longitudinal parts of the massive spin-1 $W^\pm$ and $Z^0$ fields, 
while the quantized excitations of the fourth Higgs field away from its vacuum 
value appear physically as neutral spin-0 particles, called Higgs bosons 
(Higgs 1964). 

Apart from giving mass to the $W^\pm$ and $Z^0$, the Higgs fields have more 
work to do. The electroweak gauge symmetry is exact only if all the fermion 
masses are zero; this is because it is a chiral symmetry (similar to, but not 
the same as, the chiral symmetry of QCD mentioned in section 1.2.2). Once 
again, this chiral gauge symmetry is essential to the renormalizability of the 
theory: if the fermion masses are incorporated in the usual way as parameters 
in the Lagrangian, the latter is no longer gauge invariant and the theory is 
non-renormalizable. In the SM, this problem is solved by having no fermion 
masses in the Lagrangian, and by postulating gauge-invariant Yukawa inter- 
actions between the fermions and the Higgs fields, which are arranged in such 
a way that, when the Higgs field gets a vacuum expectation value, the inter- 
action terms yield just the fermion masses. So again, the symmetry breaking 
is economically blamed on the same property of the vacuum. When the Higgs 
field oscillates away from its vacuum value, the result will be residual in- 
teractions between the fermions and the Higgs boson, which will have the 
defining characteristic that each fermion will interact with the Higgs boson 
with a strength proportional to its (i.e. the fermion’s) mass. This is clearly a 
testable prediction, once the Higgs boson is found. 

We have emphasized the role that the Higgs fields play in the renormaliz- 
ability of the GSW theory. The all-important proof of that renormalizability 
was given by ’t Hooft (1971b), and he also proved the renormalizability of 
QCD (1971a); see also ’t Hooft and Veltman (1972). 

The SM Higgs sector is the simplest one that will do the job; more compli- 
cated versions are possible. Perhaps the Higgs field is a composite formed in 
some new heavy fermion-antifermion dynamics, reminiscent of BCS pairing. 
In any case, the SM Higgs sector is there to be tested experimentally. In the 
following section we shall discuss briefly what is presently known about the 
SM Higgs boson, postponing a fuller discussion until we present the GSW 
theory in chapter 22 in volume 2. 

Before ending this section we must note that modern renormalization the- 
ory is concerned with more than perturbative calculability. The renormaliza- 
tion group and related ideas provide powerful tools for ‘improving’ perturba- 
tion theory, by systematically resumming terms which (in the particle physics 
case) dominate at short distances. Prominent among the results of this analy- 
sis (see chapters 15 and 16) are the concepts of energy-dependent (‘running’) 
masses and coupling strengths, and the calculation of QCD corrections to 
parton-model predictions. 


1.4.2 The Higgs boson of the Standard Model 


According to the SM, just one neutral spin-0 Higgs boson is expected; its 
mass $m_H$ is not predicted by the theory. The experimental discovery of the 
SM Higgs boson has been a major goal of several generations of accelerators: 
the LEP $e^+e^-$ collider at CERN, the Tevatron $p\bar{p}$ collider at Fermilab, and 
most recently the LHC pp collider at CERN. Experimentally, bounds on the 
Higgs mass can be obtained directly, through searching for its production and 
subsequent decay; non-observation will lead to a lower bound for $m_H$. There 
are also indirect constraints, coming from fits to precision measurements of 
electroweak observables. The latter are sensitive to higher order corrections 
which involve the Higgs boson as a virtual particle; these depend logarithmically 
on the unknown parameter $m_H$ and give upper bounds on $m_H$, assuming, 
of course, that the SM is correct. 
A lower bound 


$m_H > 114.4$ GeV (95% C.L.)    (1.34) 


was set at LEP (LEP 2003) by combining data on direct searches. Combining 
this with a global fit to precision electroweak data, an upper bound 


$m_H < 186$ GeV (95% C.L.)    (1.35) 


was obtained (Nakamura et al. 2010). 

By early 2012, the combined results of the CDF and D0 experiments at 
the Tevatron, and the ATLAS and CMS experiments at the LHC, excluded an 
$m_H$ value in the interval (approximately) 130 GeV to 600 GeV, at 95% C.L. 
Finally, in July 2012 the ATLAS (Aad et al. 2012) and CMS (Chatrchyan et 
al. 2012) collaborations announced the discovery, with a significance of $5\sigma$, 
of a neutral boson with a mass in the range 125-126 GeV, its production and 
decay rates being broadly compatible with the predictions for the SM Higgs 
boson. The existence of the measured decay to two photons implies that the 
particle is a boson with spin different from 1 (Landau 1948, Yang 1950), but 
spin-0 has not yet been confirmed. Nevertheless, it is probable that this is the 
(or perhaps a) Higgs boson. Its long-anticipated discovery opens a new era 
in particle physics: the experimental exploration of the symmetry-breaking 
sector of the SM. 




1.5 Summary 


The Standard Model provides a relatively simple picture of quarks and leptons 
and their non-gravitational interactions. The quark colour triplets are the 
basic source particles of the gluon fields in QCD, and they bind together to 
make hadrons. The weak interactions involve quark and lepton doublets — for 
instance the quark doublet (u,d) and the lepton doublet ($\nu_e$, $e^-$) of the first 
generation. These are sources for the $W^\pm$ and $Z^0$ fields. Charged fermions 
(quarks and leptons) are sources for the photon field. All the mediating force 
quanta have spin-1. The weak and strong force fields are generalizations of 
electromagnetism; all three are examples of gauge theories, but realized in 
subtly different ways. 

In the following chapters our aim will be to lead the reader through the 
mathematical formalism involved in giving precise quantitative form to what 
we have so far described only qualitatively and to provide physical interpre- 
tation where appropriate. In the remainder of part I of the present volume, 
we first show how Schrödinger’s quantum mechanics and Maxwell’s electromagnetic 
theory may be combined as a gauge theory, in fact the simplest 
example of such a theory. We then introduce relativistic quantum mechanics 
for spin-0 and spin-$\frac{1}{2}$ particles, and include electromagnetism via the gauge 
principle. Lorentz transformations and discrete symmetries are also covered. 
In part II, we develop the formalism of quantum field theory, beginning with 
scalar fields and moving on to QED; this is then applied to many simple (‘tree 
level’) QED processes in part III. In the final part IV, we present an intro- 
duction to renormalization at the one-loop level, including renormalization 
of QED. The more complicated gauge theories of QCD and the electroweak 
theory are reserved for volume 2. 


Problems 


1.1 Evaluate the integral in (1.26) directly. [Hint: Use spherical polar coordinates 
with the polar axis along the direction of $\mathbf{q}$, so that 
$d^3\mathbf{r} = r^2\,dr\,\sin\theta\,d\theta\,d\phi$, and 
$\exp(i\mathbf{q}\cdot\mathbf{r}) = \exp(i|\mathbf{q}|r\cos\theta)$. Make the change of variable $x = \cos\theta$, and 
do the $\phi$ integral (trivial) and the x integral. Finally do the r integral.] 


1.2 Using the concept of strangeness conservation in strong interactions, explain 
why the threshold energy (for $\pi^-$ incident on stationary protons) for 


$\pi^- + p \to K^0 + \text{anything}$ 


is less than for 


$\pi^- + p \to \bar{K}^0 + \text{anything}$ 

assuming both processes proceed through the strong interaction. 


1.3 Note: the invariant square $p^2$ of a 4-momentum $p = (E, \mathbf{p})$ is defined as 
$p^2 = E^2 - \mathbf{p}^2$. We remind the reader that $\hbar = c = 1$ (see Appendix B). 


(a) An electron of 4-momentum k scatters from a stationary proton 
of mass M via a one-photon exchange process, producing a final 
hadronic state of 4-momentum $p'$, the final electron 4-momentum 
being $k'$. Show that 


$p'^2 = q^2 + 2M(E - E') + M^2$ 


where $q^2 = (k - k')^2$, and E, $E'$ are the initial and final electron 
energies in this frame (i.e. the one in which the target proton is 
at rest). Show that if the electrons are highly relativistic then 
$q^2 = -4EE'\sin^2(\theta/2)$, where $\theta$ is the scattering angle in this frame. 
Deduce that for elastic scattering $E'$ and $\theta$ are related by 


$E' = E \Big/ \left(1 + \frac{2E}{M}\sin^2(\theta/2)\right)$. 


(b) Electrons of energy 4.879 GeV scatter elastically from protons, with 
$\theta = 10°$. What is the observed value of $E'$? 


(c) In the scattering of these electrons, at 10°, it is found that there is 
a peak of events at $E'$ = 4.2 GeV; what is the invariant mass of the 
produced hadronic state (in MeV)? 


(d) Calculate the value of $E'$ at which the ‘quasi-elastic peak’ will be 
observed, when electrons of energy 400 MeV scatter at an angle 
$\theta = 45°$ from a He nucleus, assuming that the struck nucleon is at 
rest inside the nucleus. Estimate the broadening of this final peak 
caused by the fact that the struck nucleon has, in fact, a momentum 
distribution by virtue of being localized within the nuclear size. 


1.4 

(a) In a simple non-relativistic model of a hydrogen-like atom, the energy 
levels are given by 


$E_n = -\frac{\mu Z^2 \alpha^2}{2n^2}$ 


where Z is the nuclear charge and $\mu$ is the reduced mass of the 
electron and nucleus. Calculate the splitting in eV between the 
n = 1 and n = 2 states in positronium, which is an $e^+e^-$ bound 
state, assuming this model holds. 


(b) In this model, the $e^+e^-$ potential is the simple Coulomb one 


$-\frac{e^2}{4\pi\epsilon_0 r} = -\frac{\alpha}{r}$. 


Suppose that the potential between a heavy quark Q and an antiquark 
$\bar{Q}$ was 


$-\frac{\alpha_s}{r}$ 


where $\alpha_s$ is a ‘strong fine structure constant’. Calculate values of 
$\alpha_s$ (different in (i) and (ii)) corresponding to the information (the 
quark masses are phenomenological ‘quark model’ masses) 


(i) the splitting between the n = 2 and n = 1 states in charmonium 
($c\bar{c}$) is 588 MeV, and $m_c$ = 1870 MeV; 
(ii) the splitting between the n = 2 and n = 1 states in the upsilon 
series ($b\bar{b}$) is 563 MeV, and $m_b$ = 5280 MeV. 


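(c) In positronium, the n = 1 $^3S_1$ and n = 1 $^1S_0$ states are split by the 
hyperfine interaction, which has the form 
$\frac{7}{48}\alpha^4 m_e\, \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ where $m_e$ 
is the electron mass and $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$ are the spin matrices for the $e^-$ 
and $e^+$ respectively. Calculate the expectation value of $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ in 
the $^3S_1$ and $^1S_0$ states, and hence evaluate the splitting between 
these levels (calculated in lowest order perturbation theory) in eV. 
[Hint: the total spin $\mathbf{S}$ is given by $\mathbf{S} = \frac{1}{2}(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)$. So 
$\mathbf{S}^2 = \frac{1}{4}(\boldsymbol{\sigma}_1^2 + \boldsymbol{\sigma}_2^2 + 2\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$. 
Hence the eigenvalues of $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ are directly 
related to those of $\mathbf{S}^2$.] 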


(d) Suppose an analogous ‘strong’ hyperfine interaction existed in the $c\bar{c}$ 
system, and was responsible for the splitting between the n = 1 $^3S_1$ 
and n = 1 $^1S_0$ states, which is 116 MeV experimentally (i.e. replace 
$\alpha$ by $\alpha_s$ and $m_e$ by $m_c$ = 1870 MeV). Calculate the corresponding 
value of $\alpha_s$. 


1.5 The potential between a heavy quark Q and an antiquark $\bar{Q}$ is found 
empirically to be well represented by 


$V(r) = -\frac{\alpha_s}{r} + br$ 


where $\alpha_s \approx 0.5$ and b = 0.18 GeV$^2$. Indicate the origin of the first term in 
V(r), and the significance of the second. 

An estimate of the ground-state energy of the bound $Q\bar{Q}$ system may be 
made as follows. For a given r, the total energy is 


$E(r) = 2m - \frac{\alpha_s}{r} + br + \frac{p^2}{m}$ 


where m is the mass of the Q (or $\bar{Q}$) and p is its momentum (assumed 
non-relativistic). Explain why p may be roughly approximated by 1/r, and sketch 
the resulting E(r) as a function of r. Hence show that, in this approximation, 
the radius of the ground state, $r_0$, is given by the solution of 


$\frac{2}{m r_0^3} = \frac{\alpha_s}{r_0^2} + b$. 

Taking m = 1.5 GeV as appropriate to the $c\bar{c}$ system, verify that for this 
system 

$(1/r_0) \approx 0.67$ GeV 


and calculate the energy of the $c\bar{c}$ ground state in GeV, according to this 
model. 

An excited $c\bar{c}$ state at 3.686 GeV has a total width of 278 keV, and one 
at 3.77 GeV has a total width of 24 MeV. Comment on the values of these 
widths. 


1.6 The Hamiltonian for a two-state system using the normalized base states 
$|1\rangle$, $|2\rangle$ has the form 


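$\begin{pmatrix} \langle 1|H|1\rangle & \langle 1|H|2\rangle \\ \langle 2|H|1\rangle & \langle 2|H|2\rangle \end{pmatrix} = \begin{pmatrix} -a\cos 2\theta & a\sin 2\theta \\ a\sin 2\theta & a\cos 2\theta \end{pmatrix}$ 


where a is real and positive. Find the energy eigenvalues $E_+$ and $E_-$, and 
express the corresponding normalized eigenstates $|+\rangle$ and $|-\rangle$ in terms of $|1\rangle$ 
and $|2\rangle$. 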

At time t = 0 the system is in state $|1\rangle$. Show that the probability that it 
will be found to be in state $|2\rangle$ at a later time t is 


$\sin^2 2\theta \, \sin^2(at)$. 


Discuss how a formalism of this kind can be used in the context of neutrino 
oscillations. How might the existence of neutrino oscillations explain the solar 
neutrino problem? (This will be discussed in chapter 21 of volume 2.) 


1.7 In an interesting speculation, it has been suggested (Arkani-Hamed et al. 
1998, 1999, Antoniadis et al. 1998) that the weakness of gravity as observed in 
our (apparently) three-dimensional world could be due to the fact that gravity 
actually extends into additional ‘compactified’ dimensions (that is, dimensions 
which have the geometry of a circle, rather than of an infinite line). For the 
particles and forces of the Standard Model, however, such leakage into extra 
dimensions has to be confined to currently probed distances, which are of 
order $M_W^{-1}$. 


(a) Consider Newtonian gravity in (3 + d) spatial dimensions. Explain 
why you would expect that the gravitational potential will have the 
form 


$V_{N,3+d}(r) = \frac{m_1 m_2 G_{N,3+d}}{r^{d+1}}$    (1.36) 


[Think about how the ‘$1/r^2$’ fall-off of the force is related to the 
surface area of a sphere in the case d = 0. Note that the formula 
works for d = −2! What happens in the case d = −1?] 


(b) Show that $G_{N,3+d}$ has dimensions (mass)$^{-(2+d)}$. This allows us to 
introduce the ‘true’ Planck scale (i.e. the one for the underlying 
theory in 3 + d spatial dimensions) as $G_{N,3+d} = (M_{P,3+d})^{-(2+d)}$. 


(c) Now suppose that the form (1.36) only holds when the distance r 
between the masses is much smaller than R, the size of the compactified 
dimensions. If the masses are placed at distances $r \gg R$, their 
gravitational flux cannot continue to penetrate into the extra dimensions, 
and the potential (1.36) should reduce to the familiar 
three-dimensional one; so we must have 


$V_{N,3+d}(r \gg R) = \frac{m_1 m_2 G_{N,3+d}}{R^d}\,\frac{1}{r}$.    (1.37) 


Show that this implies that 


$M_P^2 = M_{P,3+d}^2 \,(R M_{P,3+d})^d$.    (1.38) 


(d) Suppose that d = 2 and R ~ 1 mm: what would $M_{P,3+d}$ be, in TeV? 
Suggest ways in which this theory might be tested experimentally. 
Taking $M_{P,3+d}$ ~ 1 TeV, explore other possibilities for d and R. 
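
As a rough orientation for part (d), relation (1.38) can be inverted for the higher-dimensional Planck scale. The sketch below assumes the conventional values $M_P \approx 1.2 \times 10^{19}$ GeV and $\hbar c \approx 1.97 \times 10^{-16}$ GeV m, neither of which is quoted in the problem itself, so the result should be read as an order-of-magnitude estimate only.

```latex
\begin{align*}
M_{P,3+d} &= \left( \frac{M_P^2}{R^d} \right)^{1/(2+d)} , \\
R = 1\,\mathrm{mm} &\approx \frac{10^{-3}\,\mathrm{m}}{1.97\times 10^{-16}\,\mathrm{GeV\,m}}
   \approx 5.1 \times 10^{12}\,\mathrm{GeV}^{-1} , \\
d = 2:\quad M_{P,3+d} &= \left( \frac{M_P}{R} \right)^{1/2}
   \approx \left( \frac{1.2\times 10^{19}\,\mathrm{GeV}}{5.1\times 10^{12}\,\mathrm{GeV}^{-1}} \right)^{1/2}
   \approx 1.5\times 10^{3}\,\mathrm{GeV} .
\end{align*}
```

On these assumptions the ‘true’ Planck scale would lie at the TeV scale, which is what makes the speculation experimentally interesting.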

Electromagnetism as a Gauge Theory 


2.1 Introduction 


The previous chapter introduced the basic ideas of the Standard Model of 
particle physics, in which quarks and leptons interact via the exchange of 
gauge field quanta. We must now look more closely into what is the main 
concern of this book — namely, the particular nature of these ‘gauge 
theories’. 

One of the relevant forces — electromagnetism — has been well understood in 
its classical guise for many years. Over a century ago, Faraday, Maxwell and 
others developed the theory of electromagnetic interactions, culminating in 
Maxwell’s paper of 1864 (Maxwell 1864). Today Maxwell’s theory still stands 
—unlike Newton’s ‘classical mechanics’ which was shown by Einstein to require 
modifications at relativistic speeds, approaching the speed of light. Moreover, 
Maxwell’s electromagnetism, when suitably married with quantum mechanics, 
gives us ‘quantum electrodynamics’ or QED. We shall see in chapter 10 that 
this theory is in truly remarkable agreement with experiment. As we have 
already indicated, the theories of the weak and strong forces included in the 
Standard Model are generalizations of QED, and promise to be as successful 
as that theory. The simplest of the three, QED, is therefore our paradigmatic 
theory. 

From today’s perspective, the crucial thing about electromagnetism is that 
it is a theory in which the dynamics (i.e. the behaviour of the forces) is 
intimately related to a symmetry principle. In the everyday world, a symmetry 
operation is something that can be done to an object that leaves the object 
looking the same after the operation as before. By extension, we may consider 
mathematical operations — or ‘transformations’ — applied to the objects in our 
theory such that the physical laws look the same after the operations as they 
did before. Such transformations are usually called invariances of the laws. 
Familiar examples are, for instance, the translation and rotation invariance 
of all fundamental laws: Newton’s laws of motion remain valid whether or 
not we translate or rotate a system of interacting particles. But of course — 
precisely because they do apply to all laws, classical or quantum — these two 
invariances have no special connection with any particular force law. Instead, 
they constrain the form of the allowed laws to a considerable extent, but by 
no means uniquely determine them. Nevertheless, this line of argument leads 
one to speculate whether it might in fact be possible to impose further types 
of symmetry constraints so that the forms of the force laws are essentially 
determined. This would then be one possible answer to the question: why are 
the force laws the way they are? (Ultimately of course this only replaces one 
question by another!) 

In this chapter we shall discuss electromagnetism from this point of view. 
This is not the historical route to the theory, but it is the one which generalizes 
to the other two interactions. This is why we believe it important to present 
the central ideas of this approach in the familiar context of electromagnetism 
at this early stage. 

A distinction that is vital to the understanding of all these interactions 
is that between a global invariance and a local invariance. In a global in- 
variance the same transformation is carried out at all space-time points: it 
has an ‘everywhere simultaneously’ character. In a local invariance different 
transformations are carried out at different individual space-time points. In 
general, as we shall see, a theory that is globally invariant will not be invari- 
ant under locally varying transformations. However, by introducing new force 
fields that interact with the original particles in the theory in a specific way, 
and which also transform in a particular way under the local transformations, 
a sort of local invariance can be restored. We will see all these things more 
clearly when we go into more detail, but the important conceptual point to be 
grasped is this: one may view these special force fields and their interactions 
as existing in order to permit certain local invariances to be true. The par- 
ticular local invariance relevant to electromagnetism is the well-known gauge 
invariance of Maxwell’s equations: in the quantum form of the theory this 
property is directly related to an invariance under local phase transformations 
of the quantum fields. A generalized form of this phase invariance also under- 
lies the theories of the weak and strong interactions. For this reason they are 
all known as ‘gauge theories’. 

A full understanding of gauge invariance in electrodynamics can only be 
reached via the formalism of quantum field theory, which is not easy to mas- 
ter — and the theory of quantum gauge fields is particularly tricky, as we 
shall see in chapter 7. Nevertheless, many of the crucial ideas can be per- 
fectly adequately discussed within the more familiar framework of ordinary 
quantum mechanics, rather than quantum field theory, treating electromag- 
netism as a purely classical field. This is the programme followed in the rest 
of part I of this volume. In the present chapter we shall discuss these ideas in 
the context of non-relativistic quantum mechanics; in the following two chap- 
ters, we shall explore the generalization to relativistic quantum mechanics, 
for particles of spin-0 (via the Klein-Gordon equation) and spin-1/2 (via the 
Dirac equation). While containing substantial physics in their own right, these 
chapters constitute essential groundwork for the quantum field treatment in 
parts II-IV. 

2.2 The Maxwell equations: current conservation 


Question: Would you distinguish local conservation laws from global 
conservation laws? 

Feynman: If a cat were to disappear in Pasadena and at the same time 
appear in Erice, that would be an example of global conservation of cats. 
This is not the way cats are conserved. Cats or charge or baryons are 
conserved in a much more continuous way. If any of these quantities be- 
gin to disappear in a region, then they begin to appear in a neighbouring 
region. Consequently, we can identify the flow of charge out of a region 
with the disappearance of charge inside the region. This identification of 
the divergence of a flux with the time rate of change of a charge density is 
called a local conservation law. A local conservation law implies that the 
total charge is conserved globally, but the reverse does not hold. However, 
relativistically it is clear that non-local global conservation laws cannot 
exist, since to a moving observer the cat will appear in Erice before it 
disappears in Pasadena. 


— From the question-and-answer session following a lecture by R. P. Feyn- 
man at the 1964 International School of Physics ‘Ettore Majorana’ (Feyn- 
man 1965b). 


We begin by considering the basic laws of classical electromagnetism, the 
Maxwell equations. We use a system of units (Heaviside-Lorentz) which is 
convenient in particle physics (see appendix C). Before Maxwell’s work these 
laws were 


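$$ \nabla \cdot \mathbf{E} = \rho_{\rm em} \qquad \text{(Gauss' law)} \tag{2.1} $$

$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \qquad \text{(Faraday--Lenz laws)} \tag{2.2} $$

$$ \nabla \cdot \mathbf{B} = 0 \qquad \text{(no magnetic charges)} \tag{2.3} $$

and, for steady currents, 

$$ \nabla \times \mathbf{B} = \mathbf{j}_{\rm em} \qquad \text{(Ampère's law).} \tag{2.4} $$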


Here $\rho_{\rm em}$ is the charge density and $\mathbf{j}_{\rm em}$ is the current density; these densities 
act as ‘sources’ for the E and B fields. Maxwell noticed that taking the 
divergence of this last equation leads to conflict with the continuity equation 
for electric charge 

$$ \frac{\partial \rho_{\rm em}}{\partial t} + \nabla \cdot \mathbf{j}_{\rm em} = 0 . \tag{2.5} $$

Since 

$$ \nabla \cdot (\nabla \times \mathbf{B}) = 0 \tag{2.6} $$

from (2.4) there follows the result 

$$ \nabla \cdot \mathbf{j}_{\rm em} = 0 . \tag{2.7} $$
This can only be true in situations where the charge density is constant in 
time. For the general case, Maxwell modified Ampère’s law to read 

$$ \nabla \times \mathbf{B} = \mathbf{j}_{\rm em} + \frac{\partial \mathbf{E}}{\partial t} \tag{2.8} $$

which is now consistent with (2.5). Equations (2.1)–(2.3), together with (2.8), 
constitute Maxwell’s equations in free space (apart from the sources). 
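
The consistency of the modified law with the continuity equation can be checked in a few lines. The sketch below takes the divergence of (2.8), uses the identity (2.6) and Gauss' law (2.1), and assumes the fields are smooth enough for space and time derivatives to commute.

```latex
\begin{align*}
0 = \nabla\cdot(\nabla\times\mathbf{B})
  &= \nabla\cdot\mathbf{j}_{\rm em} + \nabla\cdot\frac{\partial\mathbf{E}}{\partial t} \\
  &= \nabla\cdot\mathbf{j}_{\rm em} + \frac{\partial}{\partial t}\,(\nabla\cdot\mathbf{E}) \\
  &= \nabla\cdot\mathbf{j}_{\rm em} + \frac{\partial\rho_{\rm em}}{\partial t} .
\end{align*}
```

The last line is just the continuity equation (2.5), so the added term removes the conflict found for time-dependent charge densities.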

It is worth spending a moment on the vitally important continuity equation 
(2.5) — note the Feynman quotation at the start of this section. Let us integrate 
this equation over any arbitrary volume $\Omega$, and write the result as 

$$ \frac{\partial}{\partial t} \int_\Omega \rho_{\rm em}\,\mathrm{d}V = -\int_\Omega \nabla \cdot \mathbf{j}_{\rm em}\,\mathrm{d}V . \tag{2.9} $$

Equation (2.9) states that the rate of decrease of charge in any arbitrary 
volume $\Omega$ is due precisely and only to the flux of current out of its surface; 
that is, no net charge can be created or destroyed in $\Omega$. Since $\Omega$ can be 
made as small as we please, this means that electric charge must be locally 
conserved: a process in which charge is created at one point and destroyed at a 
distant one is not allowed, despite the fact that it conserves the charge overall 
or ‘globally’. The ultimate reason for this is that the global form of charge 
conservation would necessitate the instantaneous propagation of signals (such 
as ‘now, create a positron over there’), and this conflicts with special relativity 
— a theory which, historically, flowered from the soil of electrodynamics. The 
extra term introduced by Maxwell — the ‘electric displacement current’ — owes 
its place in the dynamical equations to a local conservation requirement. 
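
The reading of (2.9) as a statement about flux through a surface rests on the divergence theorem. Writing $\Omega$ for the volume of integration in (2.9), $S$ for its closed boundary surface and $Q_\Omega$ for the charge inside it (the last two labels are introduced here purely for illustration), the step can be sketched as:

```latex
\begin{align*}
\frac{\mathrm{d}Q_\Omega}{\mathrm{d}t}
 \equiv \frac{\partial}{\partial t}\int_\Omega \rho_{\rm em}\,\mathrm{d}V
 = -\int_\Omega \nabla\cdot\mathbf{j}_{\rm em}\,\mathrm{d}V
 = -\oint_S \mathbf{j}_{\rm em}\cdot\mathrm{d}\mathbf{S} .
\end{align*}
```

The right-hand side is minus the outward flux of current through $S$, so the charge inside the volume can change only by flowing across its boundary.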

We remark at this point that we have just introduced another local/global 
distinction, similar to that discussed earlier in connection with invariances. In 
this case the distinction applies to a conservation law, but since invariances 
are related to conservation laws in both classical and quantum mechanics, we 
should perhaps not be too surprised by this. However, as with invariances, 
conservation laws — such as charge conservation in electromagnetism — play a 
central role in gauge theories in that they are closely related to the dynamics. 
The point is simply illustrated by asking how we could measure the charge 
of a newly created subatomic particle X. There are two conceptually different 
ways: 


(i) We could arrange for X to be created in a reaction such as 
A + B → C + D + X 


where the charges of A, B, C and D are already known. In this case 
we can use charge conservation to determine the charge of X. 


(ii) We could see how particle X responded to known electromagnetic 
fields. This uses dynamics to determine the charge of X. 
Either way gives the same answer: it is the conserved charge which determines 
the particle’s response to the field. By contrast, there are several other 
conservation laws that seem to hold in particle physics, such as lepton number 
and baryon number, that apparently have no dynamical counterpart (cf the 
remarks at the end of section 1.3.6). To determine the baryon number of a 
newly produced particle, we have to use B conservation and tot up the total 
baryon number on either side of the reaction. As far as we know there is no 
baryonic force field. 

Thus gauge theories are characterized by a close interrelation between three 
conceptual elements: symmetries, conservation laws and dynamics. In fact, 
it is now widely believed that the only exact quantum number conservation 
laws are those which have an associated gauge theory force field — see com- 
ment (i) in section 2.6. Thus one might suspect that baryon number is not 
absolutely conserved — as is indeed the case in proposed unified gauge theo- 
ries of the strong, weak and electromagnetic interactions. In this discussion 
we have briefly touched on the connection between two pairs of these three 
elements: symmetries ↔ dynamics; and conservation laws ↔ dynamics. The 
precise way in which the remaining link is made — between the symmetry 
of electromagnetic gauge invariance and the conservation law of charge — is 
more technical. We will discuss this connection with the help of simple ideas 
from quantum field theory in chapter 7, section 7.4. For the present we con- 
tinue with our study of the Maxwell equations and, in particular, of the gauge 
invariance they exhibit. 




2.3 The Maxwell equations: Lorentz covariance and gauge invariance 

In classical electromagnetism, and especially in quantum mechanics, it is 
convenient to introduce the vector potential $A_\mu(x)$ in place of the fields E and 
B. We write: 

$$ \mathbf{B} = \nabla \times \mathbf{A} \tag{2.10} $$

$$ \mathbf{E} = -\nabla V - \frac{\partial \mathbf{A}}{\partial t} \tag{2.11} $$

which defines the 3-vector potential A and the scalar potential V. With these 
definitions, equations (2.2) and (2.3) are then automatically satisfied. 
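
That the two homogeneous equations hold identically follows from the vector identities $\nabla\cdot(\nabla\times\mathbf{A}) = 0$ and $\nabla\times(\nabla V) = 0$. A short check, assuming the standard definitions $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{E} = -\nabla V - \partial\mathbf{A}/\partial t$:

```latex
\begin{align*}
\nabla\cdot\mathbf{B} &= \nabla\cdot(\nabla\times\mathbf{A}) = 0 , \\
\nabla\times\mathbf{E} &= -\nabla\times(\nabla V) - \frac{\partial}{\partial t}(\nabla\times\mathbf{A})
  = -\frac{\partial\mathbf{B}}{\partial t} .
\end{align*}
```

Only the two equations with sources then remain as genuine conditions on the potentials.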

The origin of gauge invariance in classical electromagnetism lies in the 
fact that the potentials A and V are not unique for given physical fields E 
and B. The transformations that A and V may undergo while preserving 
E and B (and hence the Maxwell equations) unchanged are called gauge 
transformations, and the associated invariance of the Maxwell equations is 
called gauge invariance. 
What are these transformations? Clearly A can be changed by 

$$ \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla \chi \tag{2.12} $$

where $\chi$ is an arbitrary function, with no change in B since $\nabla \times \nabla f = 0$, for 
any scalar function f. To preserve E, V must then change simultaneously by 

$$ V \to V' = V - \frac{\partial \chi}{\partial t} . \tag{2.13} $$
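
The invariance of E under the combined change can be verified explicitly. In the sketch below $\chi$ denotes the arbitrary gauge function, and E is taken to be $-\nabla V - \partial\mathbf{A}/\partial t$:

```latex
\begin{align*}
\mathbf{E}' &= -\nabla V' - \frac{\partial\mathbf{A}'}{\partial t}
  = -\nabla\!\left(V - \frac{\partial\chi}{\partial t}\right)
    - \frac{\partial}{\partial t}\left(\mathbf{A} + \nabla\chi\right) \\
 &= -\nabla V - \frac{\partial\mathbf{A}}{\partial t}
   + \nabla\frac{\partial\chi}{\partial t} - \frac{\partial}{\partial t}\nabla\chi
  = \mathbf{E} .
\end{align*}
```

The two $\chi$-dependent terms cancel because the space and time derivatives commute.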


These transformations can be combined into a single compact equation by 
introducing the 4-vector potential¹: 

$$ A^\mu = (V, \mathbf{A}) \tag{2.14} $$

and noting (from problem 2.1) that the differential operators $(\partial/\partial t, -\nabla)$ form 
the components of a 4-vector operator $\partial^\mu$. A gauge transformation is then 
specified by 

$$ A^\mu \to A'^\mu = A^\mu - \partial^\mu \chi . \tag{2.15} $$


The Maxwell equations can also be written in a manifestly Lorentz covariant 
form (see appendix D) using the 4-current $j^\mu_{\rm em}$ given by 

$$ j^\mu_{\rm em} = (\rho_{\rm em}, \mathbf{j}_{\rm em}) \tag{2.16} $$

in terms of which the continuity equation takes the form (problem 2.1): 

$$ \partial_\mu j^\mu_{\rm em} = 0 . \tag{2.17} $$

The Maxwell equations (2.1) and (2.8) then become (problem 2.2): 

$$ \partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \tag{2.18} $$

where we have defined the field strength tensor: 

$$ F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu . \tag{2.19} $$

Under the gauge transformation 

$$ A^\mu \to A'^\mu = A^\mu - \partial^\mu \chi \tag{2.20} $$

$F^{\mu\nu}$ remains unchanged: 

$$ F^{\mu\nu} \to F'^{\mu\nu} = F^{\mu\nu} \tag{2.21} $$

so $F^{\mu\nu}$ is gauge invariant and so, therefore, are the Maxwell equations in 


¹ See appendix D for relativistic notation and for an explanation of the very important 
concept of covariance, which we are about to invoke in the context of Lorentz transforma- 
tions, and will use again in the next section in the context of gauge transformations; we 
shall also use it in other contexts in later chapters. 
the form (2.18). The ‘Lorentz-covariant and gauge-invariant field equations’ 
satisfied by $A^\mu$ then follow from equations (2.18) and (2.19): 

$$ \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em} . \tag{2.22} $$
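
Both the invariance of the field strength and the field equation for the potential are quick to check. In the sketch below partial derivatives are assumed to commute, $\chi$ is the gauge function, and $\Box \equiv \partial_\mu\partial^\mu$:

```latex
\begin{align*}
F'^{\mu\nu} &= \partial^\mu(A^\nu - \partial^\nu\chi) - \partial^\nu(A^\mu - \partial^\mu\chi)
  = F^{\mu\nu} - (\partial^\mu\partial^\nu - \partial^\nu\partial^\mu)\chi = F^{\mu\nu} , \\
\partial_\mu F^{\mu\nu} &= \partial_\mu\partial^\mu A^\nu - \partial^\nu(\partial_\mu A^\mu)
  = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em} .
\end{align*}
```

The first line shows that the gauge function drops out of the field strength; the second is the substitution of the definition (2.19) into (2.18).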


Since gauge transformations turn out to be of central importance in the 
quantum theory of electromagnetism, it would be nice to have some insight 
into why Maxwell’s equations are gauge invariant. The all-important ‘fourth’ 
equation (2.8) was inferred by Maxwell from local charge conservation, as 
expressed by the continuity equation 

$$ \partial_\mu j^\mu_{\rm em} = 0 . \tag{2.23} $$

The field equation 

$$ \partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \tag{2.24} $$

then of course automatically embodies (2.23). The mathematical reason it 
does so is that $F^{\mu\nu}$ is a four-dimensional kind of ‘curl’ 

$$ F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu \tag{2.25} $$

which (as we have seen in (2.21)) is unchanged by a gauge transformation 

$$ A^\mu \to A'^\mu = A^\mu - \partial^\mu \chi . \tag{2.26} $$


Hence there is the suggestion that the gauge invariance is related in some way 
to charge conservation. However, the connection is not so simple. Wigner 
(1949) has given a simple argument to show that the principle that no phys- 
ical quantity can depend on the absolute value of the electrostatic poten- 
tial, when combined with energy conservation, implies the conservation of 
charge. Wigner’s argument relates charge (and energy) conservation to an 
invariance under transformation of the electrostatic potential by a constant: 
charge conservation alone does not seem to require the more general space- 
time-dependent transformation of gauge invariance. 
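
Returning to the remark that the field equation (2.24) automatically embodies the continuity equation (2.23): this rests only on the antisymmetry of the field strength tensor under interchange of its two indices. A sketch of the argument, obtained by acting with $\partial_\nu$ on (2.24) and relabelling the summed indices:

```latex
\begin{align*}
\partial_\nu j^\nu_{\rm em} = \partial_\nu\partial_\mu F^{\mu\nu}
  = \tfrac{1}{2}\,\partial_\nu\partial_\mu\left(F^{\mu\nu} + F^{\nu\mu}\right) = 0 .
\end{align*}
```

The symmetric combination of derivatives annihilates the antisymmetric tensor, whatever the potential may be.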

Changing the value of the electrostatic potential by a constant amount is 
an example of what we have called a global transformation (since the change 
in the potential is the same everywhere). Invariance under this global trans- 
formation is related to a conservation law: that of charge. But this global 
invariance is not sufficient to generate the full Maxwellian dynamics. How- 
ever, as remarked by ’t Hooft (1980), one can regard equations (2.12) and 
(2.13) as expressing the fact that the local change in the electrostatic potential 
V (the $\partial\chi/\partial t$ term in (2.13)) can be compensated, in the sense of leaving 
the Maxwell equations unchanged, by a corresponding local change in the 
magnetic vector potential A. Thus by including magnetic effects, the global 
invariance under a change of V by a constant can be extended to a local invariance 
(which is a much more restrictive condition to satisfy). Hence there 
is a beginning of a suggestion that one might almost ‘derive’ the complete 
Maxwell equations, which unify electricity and magnetism, from the require- 
ment that the theory be expressed in terms of potentials in such a way as 
to be invariant under local (gauge) transformations on those potentials. Cer- 
tainly special relativity must play a role too: this also links electricity and 
magnetism, via the magnetic effects of charges as seen by an observer moving 
relative to them. If a 4-vector potential $A^\mu$ is postulated, and it is then 
demanded that the theory involve it only in a way which is insensitive to local 
changes of the form (2.15), one is led naturally to the idea that the 
physical fields enter only via the quantity $F^{\mu\nu}$, which is invariant under (2.15). 
From this, one might conjecture the field equation on grounds of Lorentz 
covariance. 

It goes without saying that this is certainly not a ‘proof’ or ‘derivation’ of 
the Maxwell equations. Nevertheless, the idea that dynamics (in this case, the 
complete interconnection of electric and magnetic effects) may be intimately 
related to a local invariance requirement (in this case, electromagnetic gauge 
invariance) turns out to be a fruitful one. As indicated in section 2.1, it is 
generally the case that, when a certain global invariance is generalized to a 
local one, the existence of a new ‘compensating’ field is entailed, interacting in 
a specified way. The first example of a dynamical theory ‘derived’ from a local 
invariance requirement seems to be the theory of Yang and Mills (1954) (see 
also Shaw 1955). Their work was extended by Utiyama (1956), who developed 
a general formalism for such compensating fields. As we have said, these types 
of dynamical theories, based on local invariance principles, are called gauge 
theories. 

It is a remarkable fact that the interactions in the Standard Model of par- 
ticle physics are of precisely this type. We have briefly discussed the Maxwell 
equations in this light, and we will continue with (quantum) electrodynam- 
ics in the following two sections. The two other fundamental interactions 
— the strong interaction between quarks and the weak interaction between 
quarks and leptons — also seem to be described by gauge theories (of essen- 
tially the Yang-Mills type), as we shall see in detail in the second volume of 
this book. A fourth example, but one which we shall not pursue in this book, 
is that of general relativity (the theory of gravitational interactions). Utiyama 
(1956) showed that this theory could be arrived at by generalizing the global 
(space-time independent) coordinate transformations of special relativity to 
local ones; as with electromagnetism, the more restrictive local invariance 
requirements entailed the existence of a new field — the gravitational one — 
with an (almost) prescribed form of interaction. Unfortunately, despite this 
‘gauge’ property, no consistent quantum field theory of general relativity is 
known. 

In order to proceed further, we must now discuss how such (gauge) ideas 
are incorporated into quantum mechanics. 

2.4 Gauge invariance (and covariance) in quantum mechanics 


The Lorentz force law for a non-relativistic particle of charge q moving with 
velocity v under the influence of both electric and magnetic fields is 


$$ \mathbf{F} = q\mathbf{E} + q\,\mathbf{v} \times \mathbf{B} . \tag{2.27} $$

It may be derived, via Hamilton’s equations, from the classical Hamiltonian² 

$$ H = \frac{1}{2m}(\mathbf{p} - q\mathbf{A})^2 + qV . \tag{2.28} $$
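
A sketch of that derivation may be helpful. It is written in component notation with summation over repeated indices, with $\mathbf{v} = \mathrm{d}\mathbf{x}/\mathrm{d}t$, and it assumes the usual relations $\mathbf{E} = -\nabla V - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$; the index labels are introduced here for illustration.

```latex
\begin{align*}
\dot{x}_i &= \frac{\partial H}{\partial p_i} = \frac{1}{m}(p_i - qA_i)
  \quad\Rightarrow\quad m v_i = p_i - qA_i , \\
\dot{p}_i &= -\frac{\partial H}{\partial x_i} = q\,v_j\,\partial_i A_j - q\,\partial_i V , \\
m\dot{v}_i &= \dot{p}_i - q\left(\frac{\partial A_i}{\partial t} + v_j\,\partial_j A_i\right) \\
  &= q\left(-\partial_i V - \frac{\partial A_i}{\partial t}\right)
     + q\,v_j\left(\partial_i A_j - \partial_j A_i\right) \\
  &= q E_i + q\,(\mathbf{v}\times\mathbf{B})_i .
\end{align*}
```

Note that the canonical momentum $\mathbf{p}$ differs from the kinetic momentum $m\mathbf{v}$ by $q\mathbf{A}$; the total time derivative of $\mathbf{A}$ along the trajectory accounts for the third line.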


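The Schrödinger equation for such a particle in an electromagnetic field is 

$$ \left[ \frac{1}{2m}(-\mathrm{i}\nabla - q\mathbf{A})^2 + qV \right] \psi(\mathbf{x}, t) = \mathrm{i}\frac{\partial \psi(\mathbf{x}, t)}{\partial t} \tag{2.29} $$

which is obtained from the classical Hamiltonian by the usual prescription, 
$\mathbf{p} \to -\mathrm{i}\nabla$, for Schrödinger’s wave mechanics ($\hbar = 1$). Note the appearance of 
the operator combinations 

$$ \mathbf{D} \equiv \nabla - \mathrm{i}q\mathbf{A} , \qquad D^0 \equiv \partial/\partial t + \mathrm{i}qV \tag{2.30} $$

in place of $\nabla$ and $\partial/\partial t$, in going from the free-particle Schrödinger equation 
to the electromagnetic field case. 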
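
In terms of these combinations the equation for the charged particle has exactly the form of the free-particle equation. A compact restatement, assuming the definitions $\mathbf{D} = \nabla - \mathrm{i}q\mathbf{A}$ and $D^0 = \partial/\partial t + \mathrm{i}qV$ and writing $\psi$ for the wavefunction:

```latex
\begin{align*}
-\mathrm{i}\mathbf{D} &= -\mathrm{i}\nabla - q\mathbf{A} , \qquad
 \mathrm{i}D^0 = \mathrm{i}\frac{\partial}{\partial t} - qV , \\
\frac{1}{2m}(-\mathrm{i}\mathbf{D})^2\,\psi &= \mathrm{i}D^0\psi
 \quad\Longleftrightarrow\quad
 \left[\frac{1}{2m}(-\mathrm{i}\nabla - q\mathbf{A})^2 + qV\right]\psi
   = \mathrm{i}\frac{\partial\psi}{\partial t} .
\end{align*}
```

This form is the one used when the gauge covariance of the equation is verified.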

The solution $\psi(\mathbf{x}, t)$ of the Schrödinger equation (2.29) describes 
completely the state of the particle moving under the influence of the potentials 
V, A. However, these potentials are not unique, as we have already seen: 
they can be changed by a gauge transformation 

$$ \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla \chi \tag{2.31} $$

$$ V \to V' = V - \partial\chi/\partial t \tag{2.32} $$

and the Maxwell equations for the fields E and B will remain the same. 
This immediately raises a serious question: if we carry out such a change 
of potentials in equation (2.29), will the solution of the resulting equation 
describe the same physics as the solution of equation (2.29)? If it does, 
we shall be able to assume the validity of Maxwell’s theory for the quan- 
tum world; if not, some modification will be necessary, since the gauge sym- 
metry possessed by the Maxwell equations will be violated in the quantum 
theory. 


² We set $\hbar = c = 1$ throughout (see appendix B). 




The answer to the question just posed is evidently negative, since it is 
clear that the same ‘$\psi$’ cannot possibly satisfy both (2.29) and the analogous 
equation with (V, A) replaced by (V’, A’). Unlike Maxwell’s equations, the 
Schrödinger equation is not gauge invariant. But we must remember that the 
wavefunction $\psi$ is not a directly observable quantity, as the electromagnetic 
fields E and B are. Perhaps $\psi$ does not need to remain unchanged 
(invariant) when the potentials are changed by a gauge transformation. In fact, 
in order to have any chance of ‘describing the same physics’ in terms of the 
gauge-transformed potentials, we will have to allow $\psi$ to change as well. This 
is a crucial point: for quantum mechanics to be consistent with Maxwell’s 
equations it is necessary for the gauge transformations (2.31) and (2.32) of 
the Maxwell potentials to be accompanied also by a transformation of the 
quantum-mechanical wavefunction, $\psi \to \psi'$, where $\psi'$ satisfies the equation 

$$ \left[ \frac{1}{2m}(-\mathrm{i}\nabla - q\mathbf{A}')^2 + qV' \right] \psi'(\mathbf{x}, t) = \mathrm{i}\frac{\partial \psi'(\mathbf{x}, t)}{\partial t} . \tag{2.33} $$

Note that the form of (2.33) is exactly the same as the form of (2.29) — it is 
this that will effectively ensure that both ‘describe the same physics’. Readers 
of appendix D will expect to be told that, if we can find such a $\psi'$, we may 
then assert that (2.29) is gauge covariant, meaning that it maintains the same 
form under a gauge transformation. (The transformations relevant to this use 
of ‘covariance’ are gauge transformations.) 

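Since we know the relations (2.31) and (2.32) between A, V and A’, V’, 
we can actually find what $\psi'(\mathbf{x}, t)$ must be in order that equation (2.33) be 
consistent with (2.29). We shall state the answer and then verify it; then we 
shall discuss the physical interpretation. The required $\psi'(\mathbf{x}, t)$ is 

$$ \psi'(\mathbf{x}, t) = \exp[\mathrm{i}q\chi(\mathbf{x}, t)]\,\psi(\mathbf{x}, t) \tag{2.34} $$

where $\chi$ is the same space-time-dependent function as appears in equations 
(2.31) and (2.32). To verify this we consider 

$$ \begin{aligned}
(-\mathrm{i}\nabla - q\mathbf{A}')\psi' &= [-\mathrm{i}\nabla - q\mathbf{A} - q(\nabla\chi)][\exp(\mathrm{i}q\chi)\psi] \\
 &= q(\nabla\chi)\exp(\mathrm{i}q\chi)\psi + \exp(\mathrm{i}q\chi)\cdot(-\mathrm{i}\nabla\psi) \\
 &\quad + \exp(\mathrm{i}q\chi)\cdot(-q\mathbf{A}\psi) - q(\nabla\chi)\exp(\mathrm{i}q\chi)\psi .
\end{aligned} \tag{2.35} $$

The first and the last terms cancel leaving the result: 

$$ (-\mathrm{i}\nabla - q\mathbf{A}')\psi' = \exp(\mathrm{i}q\chi)\cdot(-\mathrm{i}\nabla - q\mathbf{A})\psi \tag{2.36} $$

which may be written using equation (2.30) as: 

$$ (-\mathrm{i}\mathbf{D}'\psi') = \exp(\mathrm{i}q\chi)\cdot(-\mathrm{i}\mathbf{D}\psi) . \tag{2.37} $$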


Thus, although the space-time-dependent phase factor feels the action of the 
gradient operator $\nabla$, it ‘passes through’ the combined operator $\mathbf{D}'$ and 
converts it into $\mathbf{D}$: in fact comparing the equations (2.34) and (2.37), we see that 
$\mathbf{D}'\psi'$ bears to $\mathbf{D}\psi$ exactly the same relation as $\psi'$ bears to $\psi$. In just the 
same way we find (cf equation (2.30)) 

$$ (\mathrm{i}D^{0\prime}\psi') = \exp(\mathrm{i}q\chi)\cdot(\mathrm{i}D^0\psi) \tag{2.38} $$

where we have used equation (2.32) for V’. Once again, $D^{0\prime}\psi'$ is simply related 
to $D^0\psi$. Repeating the operation which led to equation (2.37) we find 

$$ \begin{aligned}
\frac{1}{2m}(-\mathrm{i}\mathbf{D}')^2\psi' &= \exp(\mathrm{i}q\chi)\cdot\frac{1}{2m}(-\mathrm{i}\mathbf{D})^2\psi \\
 &= \exp(\mathrm{i}q\chi)\cdot\mathrm{i}D^0\psi \qquad \text{(using equation (2.29))} \\
 &= \mathrm{i}D^{0\prime}\psi' \qquad \text{(using equation (2.38)).}
\end{aligned} \tag{2.39} $$


Equation (2.39) is just (2.33) written in the D notation of equation (2.30), 
so we have verified that (2.34) is the correct relationship between $\psi'$ and 
$\psi$ to ensure consistency between equations (2.29) and (2.33). Precisely this 
consistency is summarized by the statement that (2.29) is gauge covariant. 

Do $\psi$ and $\psi'$ describe the same physics, in fact? The answer is yes, but it 
is not quite trivial. It is certainly obvious that the probability densities $|\psi|^2$ 
and $|\psi'|^2$ are equal, since in fact $\psi$ and $\psi'$ in equation (2.34) are related by 
a phase transformation. However, we can be interested in other observables 
involving the derivative operators $\nabla$ or $\partial/\partial t$, for example the current, which 
is essentially $\psi^*(\nabla\psi) - (\nabla\psi)^*\psi$. It is easy to check that this current is 
not invariant under (2.34), because the phase $\chi(\mathbf{x}, t)$ is $\mathbf{x}$-dependent. But 
equations (2.37) and (2.38) show us what we must do to construct 
gauge-invariant currents: namely, we must replace $\nabla$ by $\mathbf{D}$ (and in general also 
$\partial/\partial t$ by $D^0$) since then: 

$$ \psi'^{*}(\mathbf{D}'\psi') = \psi^* \exp(-\mathrm{i}q\chi)\cdot\exp(\mathrm{i}q\chi)\cdot(\mathbf{D}\psi) = \psi^*\mathbf{D}\psi \tag{2.40} $$

for example. Thus the identity of the physics described by $\psi$ and $\psi'$ is indeed 
ensured. Note, incidentally, that the equality between the first and last terms 
in (2.40) is indeed a statement of (gauge) invariance. 
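
The failure of the naive current to be invariant can be seen concretely by applying the phase transformation (2.34) to it. The sketch below writes $\psi$ for the wavefunction and assumes that the gauge function $\chi$ is real:

```latex
\begin{align*}
\psi'^{*}(\nabla\psi') - (\nabla\psi')^{*}\psi'
 &= \psi^{*}\left[\nabla + \mathrm{i}q(\nabla\chi)\right]\psi
  - \left\{\left[\nabla + \mathrm{i}q(\nabla\chi)\right]\psi\right\}^{*}\psi \\
 &= \psi^{*}(\nabla\psi) - (\nabla\psi)^{*}\psi + 2\mathrm{i}q\,(\nabla\chi)\,|\psi|^2 .
\end{align*}
```

The extra term is proportional to $\nabla\chi$ and so vanishes only for a constant phase; it is precisely this term that the change in the vector potential removes when the gradient is replaced by the gauge-covariant combination.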

We summarize these important considerations by the statement that the 
gauge invariance of Maxwell equations re-emerges as a covariance in quantum 
mechanics provided we make the combined transformation 


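$$ \begin{aligned}
\mathbf{A} &\to \mathbf{A}' = \mathbf{A} + \nabla\chi \\
V &\to V' = V - \partial\chi/\partial t \\
\psi &\to \psi' = \exp(\mathrm{i}q\chi)\psi
\end{aligned} \tag{2.41} $$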


on the potential and on the wavefunction. 

The Schrödinger equation is non-relativistic, but the Maxwell equations are 
of course fully relativistic. One might therefore suspect that the prescriptions 
discovered here are actually true relativistically as well, and this is indeed 
the case. We shall introduce the spin-0 and spin-1/2 relativistic equations in 
chapter 3. For the present we note that (2.30) can be written in manifestly 
Lorentz covariant form as 

$$ D^\mu \equiv \partial^\mu + \mathrm{i}qA^\mu \tag{2.42} $$

in terms of which (2.37) and (2.38) become 

$$ -\mathrm{i}D'^\mu\psi' = \exp(\mathrm{i}q\chi)\cdot(-\mathrm{i}D^\mu\psi) . \tag{2.43} $$

It follows that any equation involving the operator $\partial^\mu$ can be made gauge 
invariant under the combined transformation 

$$ \begin{aligned}
A^\mu &\to A'^\mu = A^\mu - \partial^\mu\chi \\
\psi &\to \psi' = \exp(\mathrm{i}q\chi)\psi
\end{aligned} $$

if $\partial^\mu$ is replaced by $D^\mu$. In fact, we seem to have a very simple prescription 
for obtaining the wave equation for a particle in the presence of an 
electromagnetic field from the corresponding free particle wave equation: make the 
replacement 

$$ \partial^\mu \to D^\mu \equiv \partial^\mu + \mathrm{i}qA^\mu . \tag{2.44} $$


In the following section this will be seen to be the basis of the so-called ‘gauge 
principle’ whereby, in accordance with the idea advanced in the previous sec- 
tions, the form of the interaction is determined by the insistence on (local) 
gauge invariance. 

One final remark: this new kind of derivative 


$$ D^\mu \equiv \partial^\mu + \mathrm{i}qA^\mu \tag{2.45} $$


turns out to be of fundamental importance — it will be the operator which 
generalizes from the (Abelian) phase symmetry of QED (see comment (iii) 
of section 2.6) to the (non-Abelian) phase symmetry of our weak and strong 
interaction theories. It is called the ‘gauge covariant derivative’, the term 
being usually shortened to ‘covariant derivative’ in the present context. The 
geometrical significance of this term will be explained in volume 2. 




2.5 The argument reversed: the gauge principle 


In the preceding section, we took it as known that the Schrödinger equation, 
for example, for a charged particle in an electromagnetic field, has the form 


$$ \left[ \frac{1}{2m}(-\mathrm{i}\nabla - q\mathbf{A})^2 + qV \right]\psi = \mathrm{i}\,\partial\psi/\partial t . \tag{2.46} $$

We then checked its gauge invariance under the combined transformation 

$$ \begin{aligned}
\mathbf{A} &\to \mathbf{A}' = \mathbf{A} + \nabla\chi \\
V &\to V' = V - \partial\chi/\partial t \\
\psi &\to \psi' = \exp(\mathrm{i}q\chi)\psi .
\end{aligned} \tag{2.47} $$

We now want to reverse the argument: we shall start by demanding that our 
theory is invariant under the space-time-dependent phase transformation 

$$ \psi(\mathbf{x}, t) \to \psi'(\mathbf{x}, t) = \exp[\mathrm{i}q\chi(\mathbf{x}, t)]\,\psi(\mathbf{x}, t) . \tag{2.48} $$

We shall demonstrate that such a phase invariance is not possible for a free 
theory, but rather requires an interacting theory involving a (4-vector) field 
whose interactions with the charged particle are precisely determined, and 
which undergoes the transformation 

$$ \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi \tag{2.49} $$

$$ V \to V' = V - \partial\chi/\partial t \tag{2.50} $$

when $\psi \to \psi'$. The demand of this type of phase invariance will have then 
dictated the form of the interaction; this is the basis of the gauge principle. 

Before proceeding we note that the resulting equation — which will of course 
turn out to be (2.29) — will not strictly speaking be invariant under (2.48), 
but rather covariant (in the gauge sense), as we saw in the preceding section. 
Nevertheless, we shall in this section sometimes continue (slightly loosely) to 
speak of ‘local phase invariance’. When we come to implement these ideas 
in quantum field theory in chapter 7 (section 7.4), using the Lagrangian for- 
malism, we shall see that the relevant Lagrangians are indeed invariant under 
(2.48). 

We therefore focus attention on the phase of the wavefunction. The abso- 
lute phase of a wavefunction in quantum mechanics cannot be measured; only 
relative phases are measurable, via some sort of interference experiment. A 
simple example is provided by the diffraction of particles by a two-slit system. 
Downstream from the slits, the wavefunction is a coherent superposition of 
two components, one originating from each slit: symbolically, 


$$ \psi = \psi_1 + \psi_2 . \tag{2.51} $$

The probability distribution $|\psi|^2$ will then involve, in addition to the separate 
intensities $|\psi_1|^2$ and $|\psi_2|^2$, the interference term 

$$ 2\,\mathrm{Re}(\psi_1^*\psi_2) = 2|\psi_1||\psi_2|\cos\delta \tag{2.52} $$

where $\delta$ (= $\delta_1 - \delta_2$) is the phase difference between components $\psi_1$ and $\psi_2$. The 
familiar pattern of alternating intensity maxima and minima is then attributed 
to variation in the phase difference $\delta$. Where the components are in phase, 
the interference is constructive and $|\psi|^2$ has a maximum; where they are out 
of phase, it is destructive and $|\psi|^2$ has a minimum. It is clear that if the 
individual phases $\delta_1$ and $\delta_2$ are each shifted by the same amount, there will 
be no observable consequences, since only the phase difference $\delta$ enters. 
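
The origin of the interference term can be made explicit by writing each component in polar form, $\psi_k = |\psi_k|\exp(\mathrm{i}\delta_k)$ for $k = 1, 2$ (a parametrization assumed here for illustration):

```latex
\begin{align*}
|\psi|^2 = |\psi_1 + \psi_2|^2
 &= |\psi_1|^2 + |\psi_2|^2 + \psi_1^{*}\psi_2 + \psi_1\psi_2^{*} \\
 &= |\psi_1|^2 + |\psi_2|^2 + 2|\psi_1||\psi_2|\cos(\delta_1 - \delta_2) .
\end{align*}
```

A common shift $\delta_k \to \delta_k + \alpha$ of both phases leaves $\delta_1 - \delta_2$, and hence the whole distribution, unchanged.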

The situation in which the wavefunction can be changed in a certain way 
without leading to any observable effects is precisely what is entailed by a 
symmetry or invariance principle in quantum mechanics. In the case under 
discussion, the invariance is that of a constant overall change in phase. In 
performing calculations it is necessary to make some definite choice of phase; 
that is, to adopt a ‘phase convention’. The invariance principle guarantees 
that any such choice, or convention, is equivalent to any other. 

Invariance under a constant change in phase is an example of a global 
invariance, according to the terminology introduced in the previous section. 
We make this point quite explicit by writing out the transformation as 


$$\psi \to \psi' = e^{i\alpha}\psi, \qquad \alpha = \text{constant} \qquad \text{(global phase invariance).} \tag{2.53}$$


That $\alpha$ in (2.53) is a constant, the same for all space-time points, expresses
the fact that once a phase convention (choice of $\alpha$) has been made at one
space-time point, the same must be adopted at all other points. Thus in
the two-slit experiment we are not free to make a local change of phase: for
example, as discussed by ’t Hooft (1980), inserting a half-wave plate behind 
just one of the slits will certainly have observable consequences. 

There is a sense in which this may seem an unnatural state of affairs. Once 
a phase convention has been adopted at one space-time point, the same con- 
vention must be adopted at all other ones: the half-wave plate must extend 
instantaneously across all of space, or not at all. Following this line of thought, 
one might then be led to ‘explore the possibility’ of requiring invariance under 
local phase transformations: that is, independent choices of phase convention 
at each space-time point. By itself, the foregoing is not a compelling mo- 
tivation for such a step. However, as we pointed out in section 2.3, such a 
move from a global to a local invariance is apparently of crucial significance 
in classical electromagnetism and general relativity, and seems now to provide 
the key to an understanding of the other interactions in the Standard Model. 
Let us see, then, where the demand of ‘local phase invariance’ 


$$\psi(\mathbf{x}, t) \to \psi'(\mathbf{x}, t) = \exp[i\alpha(\mathbf{x}, t)]\,\psi(\mathbf{x}, t) \qquad \text{(local phase invariance)} \tag{2.54}$$


leads us. 

There is immediately a problem: this is not an invariance of the free- 
particle Schrödinger equation or of any free-particle relativistic wave equation! 
For example, if the original wavefunction $\psi(\mathbf{x}, t)$ satisfied the free-particle
Schrödinger equation

$$\frac{1}{2m}(-i\nabla)^2\psi(\mathbf{x}, t) = i\,\partial\psi(\mathbf{x}, t)/\partial t \tag{2.55}$$

then the wavefunction $\psi'$, given by the local phase transformation, will not,
since both $\nabla$ and $\partial/\partial t$ now act on $\alpha(\mathbf{x}, t)$ in the phase factor. Thus local phase
invariance is not an invariance of the free-particle wave equation. If we wish
to satisfy the demands of local phase invariance, we are obliged to modify the
free-particle Schrödinger equation into something for which there is a local
phase invariance — or rather, more accurately, a corresponding covariance. 
But this modified equation will no longer describe a free particle: in other 
words, the freedom to alter the phase of a charged particle’s wavefunction 
locally is only possible if some kind of force field is introduced in which the 
particle moves. In more physical terms, the covariance will now be manifested 
in the inability to distinguish observationally between the effect of making a 
local change in phase convention and the effect of some new field in which the 
particle moves. 
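To make the obstruction explicit, it may help to write out how the derivatives act on the locally transformed wavefunction. The lines below are a sketch that uses only the product rule applied to (2.54); nothing beyond that is assumed.

```latex
\begin{align*}
\psi' &= e^{i\alpha(\mathbf{x},t)}\,\psi, \\
-i\nabla\psi' &= e^{i\alpha}\left[-i\nabla + (\nabla\alpha)\right]\psi, \\
i\,\frac{\partial\psi'}{\partial t} &= e^{i\alpha}\left[i\,\frac{\partial}{\partial t} - \frac{\partial\alpha}{\partial t}\right]\psi .
\end{align*}
```

The extra terms in $\nabla\alpha$ and $\partial\alpha/\partial t$ vanish only when $\alpha$ is constant, which is why the free equation survives a global change of phase but not a local one.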

What kind of field will this be? In fact, we know immediately what the 
answer is, since the local phase transformation 


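$$\psi \to \psi' = \exp[i\alpha(\mathbf{x}, t)]\,\psi \tag{2.56}$$

with $\alpha = q\chi$ is just the phase transformation associated with electromagnetic
gauge invariance! Thus we must modify the Schrödinger equation

$$\frac{1}{2m}(-i\nabla)^2\psi = i\,\partial\psi/\partial t \tag{2.57}$$

to

$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = (i\,\partial/\partial t - qV)\psi \tag{2.58}$$

and satisfy the local phase invariance

$$\psi \to \psi' = \exp[i\alpha(\mathbf{x}, t)]\,\psi \tag{2.59}$$

by demanding that $\mathbf{A}$ and $V$ transform by

$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + q^{-1}\nabla\alpha, \qquad V \to V' = V - q^{-1}\,\partial\alpha/\partial t \tag{2.60}$$

when $\psi \to \psi'$. The modified wave equation is of course precisely the Schrödinger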
equation describing the interaction of the charged particle with the electro- 
magnetic field described by A and V. 
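As a check, one can verify directly that the combinations appearing in (2.58) transform covariantly under (2.59) and (2.60). The following is a sketch of that verification, assuming nothing beyond those equations.

```latex
\begin{align*}
(-i\nabla - q\mathbf{A}')\,\psi'
  &= (-i\nabla - q\mathbf{A} - \nabla\alpha)\,e^{i\alpha}\psi
   = e^{i\alpha}\,(-i\nabla - q\mathbf{A})\,\psi, \\
\left(i\,\frac{\partial}{\partial t} - qV'\right)\psi'
  &= \left(i\,\frac{\partial}{\partial t} - qV + \frac{\partial\alpha}{\partial t}\right)e^{i\alpha}\psi
   = e^{i\alpha}\left(i\,\frac{\partial}{\partial t} - qV\right)\psi .
\end{align*}
```

Applying the first line twice gives $(-i\nabla - q\mathbf{A}')^2\psi' = e^{i\alpha}(-i\nabla - q\mathbf{A})^2\psi$, so both sides of (2.58) pick up the same factor $e^{i\alpha}$, and $\psi'$ satisfies the equation written with the primed potentials.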

In a Lorentz covariant treatment, A and V will be regarded as parts of a
4-vector $A^\mu$, just as $-\nabla$ and $\partial/\partial t$ are parts of $\partial^\mu$ (see problem 2.1). Thus the
presence of the vector field $A^\mu$, interacting in a ‘universal’ prescribed way with
any particle of charge q, is dictated by local phase invariance. A vector field
such as $A^\mu$, introduced to guarantee local phase invariance, is called a ‘gauge
field’. The principle that the interaction should be so dictated by the phase
(or gauge) invariance is called the gauge principle: it allows us to write down
the wave equation for the interaction directly from the free particle equation
via the replacement (2.44)³. As before, the method clearly generalizes to the
four-dimensional case.


2.6 Comments on the gauge principle in electromagnetism 
Comment (i) 


A properly sceptical reader may have detected an important sleight of hand in 
the previous discussion. Where exactly did the electromagnetic charge appear 
from? The trouble with our argument as so far presented is that we could 
have defined fields A and V so that they coupled equally to all particles — 
instead we smuggled in a factor q. 

Actually we can do a bit better than this. We can use the fact that the 
electromagnetic charge is absolutely conserved to claim that there can be no 
quantum mechanical interference between states of different charge q. Hence 
different phase changes are allowed within each ‘sector’ of definite q: 


$$\psi' = \exp(iq\chi)\,\psi \tag{2.61}$$

let us say. When this becomes a local transformation, $\chi \to \chi(\mathbf{x}, t)$, we shall
need to cancel a term $q\nabla\chi$, which will imply the presence of a ‘$-q\mathbf{A}$’ term,
as required. Note that such an argument is only possible for an absolutely 
conserved quantum number q — otherwise we cannot split up the states of 
the system into non-communicating sectors specified by different values of q. 
Reversing this line of reasoning, a conservation law such as baryon number 
conservation, with no related gauge field, would therefore now be suspected 
of not being absolutely conserved. 

We still have not tied down why q is the electromagnetic charge and not 
some other absolutely conserved quantum number. A proper discussion of 
the reasons for identifying $A^\mu$ with the electromagnetic potential and q with
the particle’s charge will be given in chapter 7 with the help of quantum field 
theory. 


Comment (ii) 


Accepting these identifications, we note that the form of the interaction con- 
tains but one parameter, the electromagnetic charge q of the particle in ques- 
tion. It is the same whatever the type of particle with charge q, whether it 
be lepton, hadron, nucleus, ion, atom, etc. Precisely this type of ‘universal- 
ity’ is present in the weak couplings of quarks and leptons, as we shall see in 
volume 2. This strongly suggests that some form of gauge principle must be
at work in generating weak interactions as well. The associated symmetry or
conservation law is, however, of a very subtle kind. Incidentally, although all
particles of a given charge q interact electromagnetically in a universal way,
there is nothing at all in the preceding argument to indicate why, in nature,
the charges of observed particles are all integer multiples of one basic charge.

³ Actually the electromagnetic interaction is uniquely specified by this procedure only
for particles of spin-0 or spin-1/2. The spin-1 case will be discussed in volume 2.


Comment (iii) 


Returning to comment (i), we may wish that we had not had to introduce the 
absolute conservation of charge as a separate axiom. As remarked earlier, at 
the end of section 2.2, we should like to relate that conservation law to the 
symmetry involved, namely invariance under (2.54). It is worth looking at the 
nature of this symmetry in a little more detail. It is not a symmetry which 
— as in the case of translation and rotation invariances for instance — involves 
changes in the space-time coordinates x and t. Instead, it operates on the 
real and imaginary parts of the wavefunction. Let us write 


$$\psi = \psi_R + i\psi_I. \tag{2.62}$$

Then

$$\psi' = e^{i\alpha}\psi = \psi'_R + i\psi'_I \tag{2.63}$$

can be written as

$$\psi'_R = (\cos\alpha)\psi_R - (\sin\alpha)\psi_I, \qquad \psi'_I = (\sin\alpha)\psi_R + (\cos\alpha)\psi_I \tag{2.64}$$

from which we can see that it is indeed a kind of ‘rotation’, but in the $\psi_R$-$\psi_I$
plane, whose ‘coordinates’ are the real and imaginary parts of the wavefunction.
We call this plane an internal space and the associated symmetry an
internal symmetry. Thus our phase invariance can be looked upon as a kind
of internal space rotational invariance.

We can imagine doing two successive such transformations

$$\psi \to \psi' \to \psi'' \tag{2.65}$$

where

$$\psi'' = e^{i\beta}\psi' \tag{2.66}$$

and so

$$\psi'' = e^{i(\alpha+\beta)}\psi = e^{i\delta}\psi \tag{2.67}$$

with $\delta = \alpha + \beta$. This is a transformation of the same form as the original one.
The set of all such transformations forms what mathematicians call a group, 
in this case U(1), meaning the group of all unitary one-dimensional matrices. 
A unitary matrix U is one such that 


$$UU^\dagger = U^\dagger U = 1 \tag{2.68}$$

where 1 is the identity matrix and $\dagger$ denotes the Hermitian conjugate. A
one-dimensional matrix is of course a single number, in this case a complex
number. Condition (2.68) limits this to being a simple phase: the set of phase
factors of the form $e^{i\alpha}$, where $\alpha$ is any real number, form the elements of a
U(1) group. These are just the factors that enter into our gauge (or phase)
transformations for wavefunctions. Thus we say that the electromagnetic
gauge group is U(1). We must remember, however, that it is a local U(1),
meaning (cf (2.54)) that the phase parameters $\alpha, \beta, \ldots$ depend on the space-time
point x.

The transformations of the U(1) group have the simple property that it 
does not matter in what order they are performed: referring to (2.65)—(2.67), 
we would have got the same final answer if we had done the $\beta$ ‘rotation’ first
and then the $\alpha$ one, instead of the other way around; this is because, of course,

$$\exp(i\alpha)\cdot\exp(i\beta) = \exp[i(\alpha+\beta)] = \exp(i\beta)\cdot\exp(i\alpha). \tag{2.69}$$

This property remains true even in the ‘local’ case when $\alpha$ and $\beta$ depend
on x. Mathematicians call U(1) an Abelian group: different transformations 
commute. We shall see later (in volume 2) that the ‘internal’ symmetry spaces 
relevant to the strong and weak gauge invariances are not so simple. The 
‘rotations’ in these cases are more like full three-dimensional rotations of real 
space, rather than the two-dimensional rotation of (2.64). We know that, in 
general, such real-space rotations do not commute, and the same will be true 
of the strong and weak rotations. Their gauge groups are called non-Abelian. 
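The contrast between commuting U(1) phases and non-commuting three-dimensional rotations can be seen in a few lines of arithmetic. This is only an illustrative sketch: the angles are arbitrary, the helper names are hypothetical, and ordinary real-space rotations about the x and z axes stand in for the ‘rotations’ of the non-Abelian case.

```python
import cmath
import math


def u1(alpha):
    """Element exp(i*alpha) of U(1)."""
    return cmath.exp(1j * alpha)


def rot_x(t):
    """3 x 3 matrix for a rotation by angle t about the x axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(t):
    """3 x 3 matrix for a rotation by angle t about the z axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(a, b):
    """Largest element-wise difference between two matrices."""
    return max(abs(x - y) for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


alpha, beta = 0.4, 1.1  # arbitrary illustrative angles

# U(1): the order of the two phase factors is irrelevant, and the product is
# again a phase factor with parameter alpha + beta, as in (2.67) and (2.69).
u1_order_gap = abs(u1(alpha) * u1(beta) - u1(beta) * u1(alpha))
u1_closure_gap = abs(u1(alpha) * u1(beta) - u1(alpha + beta))

# Rotations in three dimensions: the order matters.
rotation_order_gap = max_abs_diff(
    matmul(rot_x(alpha), rot_z(beta)),
    matmul(rot_z(beta), rot_x(alpha)),
)

print(u1_order_gap, u1_closure_gap, rotation_order_gap)
```

The first two gaps are zero to rounding accuracy, while the third is of order one.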

Once again, we shall have to wait until chapter 7 before understanding 
how the symmetry represented by (2.63) is really related to the conservation 
law of charge. 


Comment (iv) 


The attentive reader may have picked up one further loose end. The vector 
potential A is related to the magnetic field B by 


$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{2.70}$$

Thus if A has the special form

$$\mathbf{A} = \nabla f \tag{2.71}$$

B will vanish. The question we must answer, therefore, is: how do we know
that the A field introduced by our gauge principle is not of the form (2.71),
leading to a trivial theory (B = 0)? The answer to this question will lead us
on a very worthwhile detour.

The Schrödinger equation with $\nabla f$ as the vector potential is


$$\frac{1}{2m}(-i\nabla - q\nabla f)^2\psi = E\psi. \tag{2.72}$$

We can write the formal solution to this equation as

$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}}\nabla f\cdot d\mathbf{l}\right)\cdot\psi(f = 0) \tag{2.73}$$

which may be checked by using the fact that

$$\frac{\partial}{\partial a}\int^{a} f(t)\,dt = f(a). \tag{2.74}$$

The notation $\psi(f = 0)$ means just the free-particle solution with f = 0; the
line integral is taken along an arbitrary path ending in the point x. But we have

$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz = \nabla f\cdot d\mathbf{l}. \tag{2.75}$$

Hence the integral can be done trivially and the solution becomes

$$\psi = \exp[iq(f(\mathbf{x}) - f(-\infty))]\cdot\psi(f = 0). \tag{2.76}$$


We say that the phase factor introduced by the (in reality, field-free) vector
potential $\mathbf{A} = \nabla f$ is integrable: the effect of this particular A is merely
to multiply the free-particle solution by an x-dependent phase (apart from
a trivial constant phase). Since this A should give no real electromagnetic
effect, we must hope that such a change in the wavefunction is also somehow
harmless. Indeed Dirac showed (Dirac 1981, pp 92-3) that such a phase
factor corresponds merely to a redefinition of the momentum operator $\hat{p}$. The
essential point is that (in one dimension, say) $\hat{p}$ is defined ultimately by the
commutator ($\hbar = 1$)

$$[\hat{x}, \hat{p}] = i. \tag{2.77}$$

Certainly the familiar choice

$$\hat{p} = -i\frac{\partial}{\partial x} \tag{2.78}$$

satisfies this commutation relation. But we can also add any function of $\hat{x}$
to $\hat{p}$, and this modified $\hat{p}$ will be still satisfactory since $\hat{x}$ commutes with
any function of $\hat{x}$. More detailed considerations by Dirac showed that this
arbitrary function must actually have the form $\partial F/\partial x$, where F is arbitrary.
Thus

$$\hat{p} = -i\frac{\partial}{\partial x} + \frac{\partial F}{\partial x} \tag{2.79}$$

is an acceptable momentum operator. Consider then the quantum mechanics
defined by the wavefunction $\psi(f = 0)$ and the momentum operator $\hat{p} = -i\,\partial/\partial x$.
Under the unitary transformation (cf (2.76))

$$\psi(f = 0) \to e^{iqf(\hat{x})}\psi(f = 0) \tag{2.80}$$

$\hat{p}$ will be transformed to

$$\hat{p} \to e^{iqf(\hat{x})}\,\hat{p}\,e^{-iqf(\hat{x})}. \tag{2.81}$$

But the right-hand side of this equation is just $\hat{p} - q\,\partial f/\partial x$ (problem 2.3),
which is an equally acceptable momentum operator, identifying qf with the
F of Dirac. Thus the case $\mathbf{A} = \nabla f$ is indeed equivalent to the field-free case.
FIGURE 2.1
Two paths $C_1$ and $C_2$ (in two dimensions for simplicity) from $-\infty$ to the point x.


What of the physically interesting case in which A is not of the form $\nabla f$?
The equation is now

$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = E\psi \tag{2.82}$$

to which the solution is

$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l}\right)\cdot\psi(\mathbf{A} = 0). \tag{2.83}$$

The line integral can now not be done so trivially: one says that the A-field
has produced a non-integrable phase factor. There is more to this terminology
than the mere question of whether the integral is easy to do. The crucial point
is that the integral now depends on the path followed in reaching the point x,
whereas the integrable phase factor in (2.73) depends only on the end-points 
of the integral, not on the path joining them. 

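Consider two paths $C_1$ and $C_2$ (figure 2.1) from $-\infty$ to the point x. The
difference in the two line integrals is the integral over a closed curve C, which
can be evaluated by Stokes’ theorem:

$$\int_{C_1}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l} - \int_{C_2}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l} = \oint_C\mathbf{A}\cdot d\mathbf{l} = \iint_S\nabla\times\mathbf{A}\cdot d\mathbf{S} = \iint_S\mathbf{B}\cdot d\mathbf{S} \tag{2.84}$$

where S is any surface spanning the curve C. In this form we see that if $\mathbf{A} = \nabla f$,
then indeed the line integrals over $C_1$ and $C_2$ are equal since $\nabla\times\nabla f = 0$,
but if $\mathbf{B} = \nabla\times\mathbf{A}$ is not zero, the difference between the integrals is determined
by the enclosed flux of B.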

This analysis turns out to imply the existence of a remarkable phenomenon,
the Aharonov-Bohm effect, named after its discoverers (Aharonov and Bohm
1959). Suppose we go back to our two-slit experiment of section 2.5, only this
time we imagine that a long thin solenoid is inserted between the slits, so
that the components $\psi_1$ and $\psi_2$ of the split beam pass one on each side of
the solenoid (figure 2.2). After passing round the solenoid, the beams are
recombined, and the resulting interference pattern is observed downstream.
At any point x of the pattern, the phase of the $\psi_1$ and $\psi_2$ components will be
modified, relative to the B = 0 case, by factors of the form (2.83). These
factors depend on the respective paths, which are different for the two components
$\psi_1$ and $\psi_2$. The phase difference between these components, which
determines the interference pattern, will therefore involve the B-dependent
factor (2.84). Thus, even though the field B is essentially totally contained
within the solenoid, and the beams themselves have passed through B = 0
regions only, there is nevertheless an observable effect on the pattern provided
$\mathbf{B} \neq 0$! This effect, a shift in the pattern as B varies, was first confirmed experimentally
by Chambers (1960), soon after its prediction by Aharonov and
Bohm. It was anticipated in work by Ehrenberg and Siday (1949); further
references and discussion are contained in Berry (1984).

FIGURE 2.2
The Aharonov-Bohm effect.
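The path dependence expressed by (2.84) can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the solenoid is idealized as infinitely thin and placed at the origin, its exterior potential is taken in the standard azimuthal form with total flux `FLUX`, the two paths are semicircles of radius `RADIUS`, and the pure-gauge example uses the arbitrary function f = x + x y². All names and values are hypothetical.

```python
import math

FLUX = 2.5    # assumed total magnetic flux carried by the thin solenoid
RADIUS = 3.0  # radius of the two semicircular paths (outside the solenoid)


def line_integral(field, path, steps=20000):
    """Midpoint-rule estimate of the line integral of field along path(s), 0 <= s <= 1."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, steps + 1):
        x1, y1 = path(k / steps)
        ax, ay = field(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += ax * (x1 - x0) + ay * (y1 - y0)
        x0, y0 = x1, y1
    return total


def solenoid_potential(x, y):
    """Vector potential outside a thin solenoid at the origin (B = 0 there)."""
    r2 = x * x + y * y
    return (-FLUX * y / (2 * math.pi * r2), FLUX * x / (2 * math.pi * r2))


def gradient_potential(x, y):
    """Pure-gauge potential A = grad f with f = x + x*y**2."""
    return (1.0 + y * y, 2.0 * x * y)


def upper_path(s):
    """Semicircle from (-RADIUS, 0) to (RADIUS, 0) passing above the solenoid."""
    angle = math.pi * (1.0 - s)
    return (RADIUS * math.cos(angle), RADIUS * math.sin(angle))


def lower_path(s):
    """Semicircle from (-RADIUS, 0) to (RADIUS, 0) passing below the solenoid."""
    angle = math.pi * (1.0 + s)
    return (RADIUS * math.cos(angle), RADIUS * math.sin(angle))


gauge_difference = (line_integral(gradient_potential, upper_path)
                    - line_integral(gradient_potential, lower_path))
solenoid_difference = (line_integral(solenoid_potential, upper_path)
                       - line_integral(solenoid_potential, lower_path))

print(gauge_difference, solenoid_difference)
```

For the gradient potential the two paths give the same integral, so the phase factor is integrable. For the solenoid the difference has magnitude equal to the enclosed flux (the sign depends on the orientation of the closed curve), even though both paths lie entirely in the field-free region.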


Comment (v) 


In conclusion, we must emphasize that there is ultimately no compelling logic 
for the vital leap to a local phase invariance from a global one. The latter is, 
by itself, both necessary and sufficient in quantum field theory to guarantee 
local charge conservation. Nevertheless, the gauge principle — deriving inter- 
actions from the requirement of local phase invariance — provides a satisfying 
conceptual unification of the interactions present in the Standard Model. In 
volume 2 of this book we shall consider generalizations of the electromagnetic 
gauge principle. It will be important always to bear in mind that any at- 
tempt to base theories of non-electromagnetic interactions on some kind of 
gauge principle can only make sense if there is an exact symmetry involved. 
The reason for this will only become clear when we consider the renormaliz- 
ability of QED in chapter 11. 
Problems


2.1

(a) A Lorentz transformation in the $x^1$ direction is given by

$$t' = \gamma(t - vx^1), \qquad x^{1\prime} = \gamma(-vt + x^1), \qquad x^{2\prime} = x^2, \qquad x^{3\prime} = x^3$$

where $\gamma = (1 - v^2)^{-1/2}$ and c = 1. Write down the inverse of this
transformation (i.e. express $(t, x^1)$ in terms of $(t', x^{1\prime})$), and use the
‘chain rule’ of partial differentiation to show that, under the Lorentz
transformation, the two quantities $(\partial/\partial t, -\partial/\partial x^1)$ transform in the
same way as $(t, x^1)$.
[The general result is that the four-component quantity
$(\partial/\partial t, -\partial/\partial x^1, -\partial/\partial x^2, -\partial/\partial x^3) \equiv (\partial/\partial t, -\nabla)$ transforms in the same
way as $(t, x^1, x^2, x^3)$. Four-component quantities transforming this
way are said to be ‘contravariant 4-vectors’, and are written with
an upper 4-vector index; thus $(\partial/\partial t, -\nabla) \equiv \partial^\mu$. Upper indices
can be lowered by using the metric tensor $g_{\mu\nu}$, see appendix D,
which reverses the sign of the spatial components. Thus
$\partial^\mu = (\partial/\partial t, \partial/\partial x_1, \partial/\partial x_2, \partial/\partial x_3)$. Similarly the four quantities
$(\partial/\partial t, \nabla) = (\partial/\partial t, \partial/\partial x^1, \partial/\partial x^2, \partial/\partial x^3)$ transform as $(t, -x^1, -x^2, -x^3)$ and
are a ‘covariant 4-vector’, denoted by $\partial_\mu$.]

(b) Check that equation (2.5) can be written as (2.17).

2.2 How many independent components does the field strength $F^{\mu\nu}$ have?
Express each component in terms of electric and magnetic field components.
Hence verify that equation (2.18) correctly reproduces both equations (2.1)
and (2.8).

2.3 Verify the result

$$e^{iqf(\hat{x})}\,\hat{p}\,e^{-iqf(\hat{x})} = \hat{p} - q\frac{\partial f}{\partial x}.$$
Relativistic Quantum Mechanics


It is clear that the non-relativistic Schrödinger equation is quite inadequate 
to analyse the results of experiments at energies far higher than the rest 
mass energies of the particles involved. Besides, the quarks and leptons have
spin-1/2, a degree of freedom absent from the Schrödinger wavefunction. We
therefore need two generalizations: from non-relativistic to relativistic for
spin-0 particles, and from spin-0 to spin-1/2. The first step is to the Klein–Gordon
equation (section 3.1), the second to the Dirac equation (section 3.2).
Then after some further work on solutions of the Dirac equation (sections 3.3-3.4),
we shall consider (section 3.5) some simple consequences of including the
electromagnetic interaction via the gauge principle replacement (2.44).


3.1 The Klein—Gordon equation 


The non-relativistic Schrödinger equation may be put into correspondence 
with the non-relativistic energy-momentum relation 


$$E = \mathbf{p}^2/2m \tag{3.1}$$

by means of the operator replacements¹

$$E \to i\,\partial/\partial t \tag{3.2}$$

$$\mathbf{p} \to -i\nabla, \tag{3.3}$$


these differential operators being understood to act on the Schrödinger wave- 
function. 

For a relativistic wave equation we must start with the correct relativistic 
energy-momentum relation. Energy and momentum appear as the ‘time’ and 
‘space’ components of the momentum 4-vector 


$$p^\mu = (E, \mathbf{p}) \tag{3.4}$$

which satisfy the mass-shell condition

$$p^2 = p_\mu p^\mu = E^2 - \mathbf{p}^2 = m^2. \tag{3.5}$$

¹ Recall $\hbar = c = 1$ throughout (see appendix B).

Since energy and momentum are merely different components of a 4-vector,
an attempt to base a relativistic theory on the relation 


$$E = +(\mathbf{p}^2 + m^2)^{1/2} \tag{3.6}$$


is unattractive, as well as having obvious difficulties in interpretation for the 
square root operator. Schrödinger, before settling for the less ambitious non- 
relativistic Schrödinger equation, and later Klein and Gordon, attempted to 
build relativistic quantum mechanics (RQM) from the squared relation 


$$E^2 = \mathbf{p}^2 + m^2. \tag{3.7}$$

Using the operator replacements for E and p we are led to

$$-\partial^2\phi/\partial t^2 = (-\nabla^2 + m^2)\phi \tag{3.8}$$

which is the Klein–Gordon equation (KG equation). We consider the case of a
one-component scalar wavefunction $\phi(\mathbf{x}, t)$: one expects this to be appropriate
for the description of spin-0 bosons. 


3.1.1 Solutions in coordinate space 


In terms of the D’Alembertian operator 


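$$\Box \equiv \partial_\mu\partial^\mu = \frac{\partial^2}{\partial t^2} - \nabla^2 \tag{3.9}$$

the KG equation reads:

$$(\Box + m^2)\phi(\mathbf{x}, t) = 0. \tag{3.10}$$

Let us look for a plane-wave solution of the form

$$\phi(\mathbf{x}, t) = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} = Ne^{-ip\cdot x} \tag{3.11}$$

where we have written the exponent in suggestive 4-vector scalar product
notation

$$p\cdot x = p_\mu x^\mu = Et - \mathbf{p}\cdot\mathbf{x} \tag{3.12}$$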


and N is a normalization factor which need not be decided upon here (see sec- 
tion 8.1.1). In order that this wavefunction be a solution of the KG equation, 
we find by direct substitution that E must be related to p by the condition 


$$E^2 = \mathbf{p}^2 + m^2. \tag{3.13}$$


This looks harmless enough, but it actually implies that for a given 3-momentum
p there are in fact two possible solutions for the energy, namely

$$E = \pm(\mathbf{p}^2 + m^2)^{1/2}. \tag{3.14}$$


As Schrödinger and others quickly found, it is not possible to ignore the nega- 
tive solutions without obtaining inconsistencies. What then do these negative- 
energy solutions mean? 
3.1.2 Probability current for the KG equation


In exactly the same way as for the non-relativistic Schrödinger equation, it 
is possible to derive a conservation law for a ‘probability current’ of the KG 
equation. We have 

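$$\frac{\partial^2\phi}{\partial t^2} - \nabla^2\phi + m^2\phi = 0 \tag{3.15}$$

and by multiplying this equation by $\phi^*$, and subtracting $\phi$ times the complex
conjugate of equation (3.15), one obtains, after some manipulation (see
problem 3.1), the result

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{3.16}$$

where

$$\rho = i\left[\phi^*\frac{\partial\phi}{\partial t} - \left(\frac{\partial\phi^*}{\partial t}\right)\phi\right] \tag{3.17}$$

and

$$\mathbf{j} = i^{-1}[\phi^*\nabla\phi - (\nabla\phi^*)\phi] \tag{3.18}$$

(the derivatives $(\partial_\mu\phi^*)$ act only within the bracket). In explicit 4-vector notation
this conservation condition reads (cf problem 2.1 and equation (D.4)
in appendix D)

$$\partial_\mu j^\mu = 0 \tag{3.19}$$

with

$$j^\mu \equiv (\rho, \mathbf{j}) = i[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi]. \tag{3.20}$$

Since $\phi$ of (3.11) is Lorentz invariant and $\partial^\mu$ is a contravariant 4-vector, equation
(3.20) shows explicitly that $j^\mu$ is a contravariant 4-vector, as anticipated
in the notation.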

The spatial current j is identical in form to the Schrödinger current, but
for the KG case the ‘probability density’ now contains time derivatives since
the KG equation is second order in $\partial/\partial t$. This means that $\rho$ is not constrained
to be positive definite, so how can $\rho$ represent a probability density? We can
see this problem explicitly for the plane-wave solutions

$$\phi = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} \tag{3.21}$$

which give (problem 3.1)

$$\rho = 2|N|^2E \tag{3.22}$$

and E can be positive or negative: that is, the sign of $\rho$ is the sign of energy.
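The sign problem can be seen in a short numerical check. The sketch below is illustrative only: it works in one space dimension, uses arbitrary values for the mass, momentum and normalization, and approximates the time derivative in the density (3.17) by a central difference. The function names are hypothetical.

```python
import cmath
import math


def plane_wave(t, x, energy, p, norm=1.0):
    """Plane wave N exp(-iEt + ipx) in one space dimension."""
    return norm * cmath.exp(-1j * energy * t + 1j * p * x)


def kg_density(t, x, energy, p, norm=1.0, h=1e-5):
    """KG density i[phi* d(phi)/dt - (d(phi*)/dt) phi], using a central difference in t."""
    phi = plane_wave(t, x, energy, p, norm)
    dphi_dt = (plane_wave(t + h, x, energy, p, norm)
               - plane_wave(t - h, x, energy, p, norm)) / (2 * h)
    rho = 1j * (phi.conjugate() * dphi_dt - dphi_dt.conjugate() * phi)
    return rho.real


m, p, N = 1.0, 0.75, 0.5          # assumed illustrative values
E = math.sqrt(p * p + m * m)      # positive root of E^2 = p^2 + m^2

rho_positive = kg_density(0.2, 0.4, +E, p, norm=N)
rho_negative = kg_density(0.2, 0.4, -E, p, norm=N)

print(rho_positive, rho_negative, 2 * N * N * E)
```

Both roots give a solution, and the computed density equals $2|N|^2E$ in each case, so it is negative for the negative-energy root.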

Historically, this problem of negative probabilities coupled with that of 
negative energies led to the abandonment of the KG equation. For the mo- 
ment we will follow history, and turn to the Dirac equation. We shall see in 
section 3.4, however, how the negative-energy solutions of the KG equation 
do after all have a role to play, following Feynman’s interpretation, in pro- 
cesses involving antiparticles. Later, in chapters 5-7, we shall see how this 
interpretation arises naturally within the formalism of quantum field theory. 
3.2 The Dirac equation


In the case of the KG equation it is clear why the problem arose: 


(i) In constructing a wave equation in close correspondence with the 
squared energy-momentum relation 


$$E^2 = \mathbf{p}^2 + m^2$$


we immediately allowed negative-energy solutions.


(ii) The KG equation has a $\partial^2/\partial t^2$ term: this leads to a continuity
equation with a ‘probability density’ containing $\partial/\partial t$, and hence to
negative probabilities.


Dirac approached these problems in his characteristically direct way. In
order to obtain a positive-definite probability density $\rho > 0$, he required an
equation linear in $\partial/\partial t$. Then, for relativistic covariance (see chapter 4), the
equation must also be linear in $\nabla$. He postulated the equation (Dirac 1928)


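$$i\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = \left[-i\left(\alpha_1\frac{\partial}{\partial x^1} + \alpha_2\frac{\partial}{\partial x^2} + \alpha_3\frac{\partial}{\partial x^3}\right) + \beta m\right]\psi(\mathbf{x}, t) = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\,\psi(\mathbf{x}, t). \tag{3.23}$$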


What are the $\alpha$’s and $\beta$? To find the conditions on the $\alpha$’s and $\beta$, consider
what we require of a relativistic wave equation:


(i) the correct relativistic relation between E and p, namely

$$E = +(\mathbf{p}^2 + m^2)^{1/2}$$


(ii) the equation should be covariant under Lorentz transformations. 


We shall postpone discussion of (ii) until the following chapter. To solve 
requirement (i), Dirac in fact demanded that his wavefunction $\psi$ satisfy, in
addition, a KG-type condition

$$-\partial^2\psi/\partial t^2 = (-\nabla^2 + m^2)\psi. \tag{3.24}$$

We note with hindsight that we have once more opened the door to negative-energy
solutions: Dirac’s remarkable achievement was to turn this apparent
defect into one of the triumphs of theoretical physics!

We can now derive conditions on $\boldsymbol{\alpha}$ and $\beta$. We have

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi \tag{3.25}$$
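and so, squaring the operator on both sides,

$$\begin{aligned}
\left(i\frac{\partial}{\partial t}\right)^2\psi &= (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)(-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi \\
&= -\sum_{i=1}^{3}\alpha_i^2\frac{\partial^2\psi}{(\partial x^i)^2} - \sum_{\substack{i,j=1\\ i>j}}^{3}(\alpha_i\alpha_j + \alpha_j\alpha_i)\frac{\partial^2\psi}{\partial x^i\partial x^j} \\
&\quad - im\sum_{i=1}^{3}(\alpha_i\beta + \beta\alpha_i)\frac{\partial\psi}{\partial x^i} + \beta^2m^2\psi.
\end{aligned} \tag{3.26}$$

But by our assumption that $\psi$ also satisfies the KG condition, we must have

$$\left(i\frac{\partial}{\partial t}\right)^2\psi = -\sum_{i=1}^{3}\frac{\partial^2\psi}{(\partial x^i)^2} + m^2\psi. \tag{3.27}$$

It is thus evident that the $\alpha$’s and $\beta$ cannot be ordinary, classical, commuting
quantities. Instead they must satisfy the following anticommutation relations
in order to eliminate the unwanted terms on the right-hand side of equation
(3.26):

$$\alpha_i\beta + \beta\alpha_i = 0 \qquad i = 1, 2, 3 \tag{3.28}$$

$$\alpha_i\alpha_j + \alpha_j\alpha_i = 0 \qquad i, j = 1, 2, 3;\ i \neq j. \tag{3.29}$$

In addition we require

$$\alpha_i^2 = \beta^2 = 1. \tag{3.30}$$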


Dirac proposed that the $\alpha$’s and $\beta$ should be interpreted as matrices, acting
on a wavefunction which had several components arranged as a column vector. 
Anticipating somewhat the results of the next section, we would expect that, 
since each such component obeys the same wave equation, the physical states 
which they represent would have the same energy. This would mean that the 
different components represent some degeneracy, associated with a new degree 
of freedom. 

The degree of freedom is, of course, spin — an entirely quantum mechani- 
cal angular momentum, analogous to (but not equivalent to) orbital angular 
momentum. Consider, for example, the wavefunctions for the 2p state in the 
simple non-relativistic theory of the hydrogen atom. There are three of them, 
all degenerate with energy given by the n = 2 Bohr energy. The three corresponding
states all have orbital angular momentum quantum number $l$ equal
to 1; they differ in their values of the ‘magnetic’ quantum number m (i.e.
the eigenvalue of the z-component of the orbital angular momentum operator
$\hat{L}$). Specifically, these three wavefunctions have the form (omitting normalization
constants) $(r\sin\theta\,e^{i\phi},\ r\sin\theta\,e^{-i\phi},\ r\cos\theta)\,e^{-r/2r_B}$, where $r_B$ is the Bohr
radius. Remembering the expressions for the Cartesian coordinates x, y and z
in terms of the spherical polar coordinates $r$, $\theta$ and $\phi$, we see that by a suitable
linear combination (always allowed for degenerate states) we can write these
wavefunctions as (x,y,z)f(r), where again a normalization factor has been 
omitted. In this form it is plain that the multiplicity of the p-state wavefunc- 
tions can be interpreted in simple geometrical terms: they are effectively the 
components of a vector (multiplication by the scalar function f(r) does not 
affect this). 

The several components of the Dirac wavefunction together make up a 
similar, but quite distinct, object called a spinor. We shall have more to say 
about this in chapter 4. For the moment we continue with the problem of 
finding the matrices $\alpha_i$ and $\beta$ to satisfy (3.28)-(3.30).

As problem 3.2 shows, the smallest possible dimension of the matrices for 
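which the Dirac conditions can be satisfied is 4 x 4. One conventional choice
of the $\alpha$’s and $\beta$ is

$$\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{3.31}$$

where we have written these 4 x 4 matrices in 2 x 2 ‘block diagonal’ form, the
$\sigma_i$’s are the 2 x 2 Pauli matrices, 1 is the 2 x 2 unit matrix, and 0 is the 2 x 2
null matrix. The Pauli matrices (see appendix A) are defined by

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.32}$$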


Readers unfamiliar with the labour-saving ‘block’ form of (3.31) should verify,
both by using the corresponding explicit 4 x 4 matrices, such as

$$\alpha_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{3.33}$$

and so on, and by the block diagonal form, that this choice does indeed satisfy
the required conditions. These are

$$\{\alpha_i, \beta\} = 0 \tag{3.34}$$

$$\{\alpha_i, \alpha_j\} = 2\delta_{ij}\,1 \tag{3.35}$$

$$\beta^2 = 1 \tag{3.36}$$

where {A, B} is the anticommutator of two matrices, AB + BA, and 1 is
here the 4 x 4 unit matrix.
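The verification suggested above can also be done mechanically. The following is a minimal sketch in plain Python, with matrices stored as nested lists; the helper names (`block`, `anticommutator`, `satisfies_dirac_conditions`) are hypothetical, and the matrices are those of the standard representation (3.31) built from the Pauli matrices (3.32).

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    """Return {a, b} = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def block(tl, tr, bl, br):
    """Assemble a 4 x 4 matrix from four 2 x 2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def scaled_identity(c, n=4):
    """Return c times the n x n unit matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]

# Standard representation (3.31).
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)


def satisfies_dirac_conditions(alpha, beta):
    """Check (3.34)-(3.36) for a set of three alpha matrices and beta."""
    ok = close(matmul(beta, beta), scaled_identity(1))
    for i in range(3):
        ok = ok and close(anticommutator(alpha[i], beta), scaled_identity(0))
        for j in range(3):
            expected = scaled_identity(2 if i == j else 0)
            ok = ok and close(anticommutator(alpha[i], alpha[j]), expected)
    return ok


print(satisfies_dirac_conditions(ALPHA, BETA))  # prints True
```

Replacing the matrices by ordinary commuting numbers, for instance unit matrices, makes the check fail, which is the content of the remark that the $\alpha$’s and $\beta$ cannot be classical quantities.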

At this point we can already begin to see that the extra multiplicity is
very likely to have something to do with an angular momentum-like degree of
freedom. In fact, if we define the spin matrices S by $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$ ($\hbar = 1$), we find
from (3.32) that

$$[S_x, S_y] = iS_z \tag{3.37}$$

(with obvious cyclic permutations), which are precisely the commutation relations
satisfied by the components $J_x$, $J_y$ and $J_z$ of the angular momentum
operator J in quantum mechanics (see appendix A). Furthermore, the eigenvalues
of $S_z$ are $\pm\frac{1}{2}$, and of $\mathbf{S}^2$ are $s(s + 1)$ with $s = \frac{1}{2}$. So these matrices
undoubtedly represent quantum mechanical angular momentum operators,
appropriate to a state with angular momentum quantum number $j = \frac{1}{2}$. This
is precisely what ‘spin’ is. We will discuss this in more detail in section 3.3.

It is important to note that the choice (3.31) of $\boldsymbol{\alpha}$ and $\beta$ is not unique. In
fact, all matrices related to these by any unitary 4 x 4 matrix U (which thus
preserves the anticommutation relations) are allowed:

$$\alpha'_i = U\alpha_iU^{-1} \tag{3.38}$$

$$\beta' = U\beta U^{-1}. \tag{3.39}$$


Another commonly used representation is provided by the matrices 


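$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{3.40}$$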


The reader may check (problem 3.2) that these matrices also satisfy (3.34)- 
(3.36). 

Unless otherwise stated, we shall use the standard representation (3.31). 
This is generally convenient for ‘low energy’ applications — that is, when the 
momentum |p| is significantly smaller than the mass m. In that case, $\beta m$ will
be the largest term in the Dirac Hamiltonian (see (3.23)), and it is sensible 
to have it in diagonal form. The choice (3.40), by contrast, is more natural 
when the mass is small compared with the energy or momentum. 


3.2.1 Free-particle solutions 


Since the Dirac Hamiltonian now involves 4 x 4 matrices, it is clear that we 
must interpret the Dirac wavefunction $\psi$ as a four-component column vector,
the so-called Dirac spinor. Let us look at the explicit form of the free-particle
solutions. As in the KG case, we look for solutions in which the space-time
behaviour is of plane-wave form and put

$$\psi = \omega e^{-ip\cdot x} \tag{3.41}$$

where $\omega$ is a four-component spinor independent of x, and $e^{-ip\cdot x}$, with $p^\mu = (E, \mathbf{p})$,
is the plane-wave solution corresponding to 4-momentum $p^\mu$. We substitute
this into the Dirac equation

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi \tag{3.42}$$

using the explicit $\boldsymbol{\alpha}$ and $\beta$ matrices. In order to use the 2 x 2 block form, it is
conventional (and convenient) to split the spinor $\omega$ into two two-component
spinors $\phi$ and $\chi$:

$$\omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{3.43}$$
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We obtain the matrix equation (see problem 3.3) 


e(%) - ee a ($) (3.44) 


representing two coupled equations for ¢ and y: 
(E—m)o =o - px (3.45) 


and 
(E + m)x =o - pọ. (3.46) 


Solving for $\chi$ from (3.46), the general four-component spinor may be written
(without worrying about normalization for the moment)

$$\omega = \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi \end{pmatrix}. \tag{3.47}$$

What is the relation between $E$ and $\mathbf{p}$ for this to be a solution of the Dirac
equation? If we substitute $\chi$ from (3.46) into (3.45) and remember that
(problem 3.4)

$$(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2 \mathbf{1} \tag{3.48}$$

we find that

$$(E - m)(E + m)\phi = \mathbf{p}^2 \phi \tag{3.49}$$

for any $\phi$. Hence we arrive at the same result as for the KG equation in that
for a given value of $\mathbf{p}$, two values of $E$ are allowed:

$$E = \pm(\mathbf{p}^2 + m^2)^{1/2} \tag{3.50}$$

i.e. positive- and negative-energy solutions are still admitted.
The Dirac equation does not therefore solve this problem. What about 
the probability current? 
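
The two-sign result (3.50) can be checked numerically. The sketch below is only an illustration: it assumes the block form of the Hamiltonian that appears in (3.44), picks an arbitrary momentum and mass, builds the spinor (3.47) for each sign of $E$, and measures how far $(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m)\omega$ is from $E\omega$. The function names and the numerical values are hypothetical choices made for this example.

```python
import math


def sigma_dot(p):
    """2x2 matrix sigma.p for a 3-vector p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]


def apply(matrix, vector):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[k] * vector[k] for k in range(len(vector)))
            for row in matrix]


def dirac_hamiltonian(p, m):
    """alpha.p + beta m in the block form of (3.44), as a 4x4 nested list."""
    sp = sigma_dot(p)
    return [
        [m, 0] + sp[0],
        [0, m] + sp[1],
        sp[0] + [-m, 0],
        sp[1] + [0, -m],
    ]


def spinor(phi, p, m, energy):
    """Unnormalized spinor (3.47): upper part phi, lower part sigma.p phi/(E+m)."""
    lower = [c / (energy + m) for c in apply(sigma_dot(p), phi)]
    return list(phi) + lower


def residual(phi, p, m, energy):
    """Largest component of (H - E) omega; zero for a genuine solution."""
    omega = spinor(phi, p, m, energy)
    h_omega = apply(dirac_hamiltonian(p, m), omega)
    return max(abs(a - energy * b) for a, b in zip(h_omega, omega))


p = (0.3, -0.4, 1.2)  # illustrative momentum, natural units
m = 1.0               # illustrative mass
E = math.sqrt(sum(c * c for c in p) + m * m)

res_plus = residual([1, 0], p, m, +E)            # E = +(p^2 + m^2)^(1/2)
res_minus = residual([0, 1], p, m, -E)           # E = -(p^2 + m^2)^(1/2)
res_off_shell = residual([1, 0], p, m, 1.1 * E)  # deliberately wrong energy
print(res_plus, res_minus, res_off_shell)
```

The first two residuals should vanish to rounding error, while the third should not: the spinor (3.47) solves the equation only when $E$ takes one of the two values in (3.50). For the negative root the denominator $E + m$ is still non-zero as long as $\mathbf{p} \neq 0$.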


3.2.2 Probability current for the Dirac equation 
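Consider the following quantity which we denote (suggestively) by $\rho$:

$$\rho = \psi^\dagger(x)\psi(x). \tag{3.51}$$

Here $\psi^\dagger$ is the Hermitian conjugate row vector of the column vector $\psi$. In
terms of components

$$\rho = (\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*) \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} \tag{3.52}$$

so

$$\rho = \sum_{a=1}^{4} |\psi_a|^2 > 0 \tag{3.53}$$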


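and we see that $\rho$ is a scalar density which is explicitly positive-definite. This
is one property we require of a probability density: in addition, we require
a conservation law, coming from the Dirac equation, and a corresponding
probability current density. In fact (see problem 3.5) we can demonstrate,
using the Dirac equation,

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha} \cdot \boldsymbol{\nabla} + \beta m)\psi \tag{3.54}$$

and its Hermitian conjugate

$$-i\,\partial\psi^\dagger/\partial t = \psi^\dagger(+i\boldsymbol{\alpha} \cdot \overleftarrow{\boldsymbol{\nabla}} + \beta m) \tag{3.55}$$

that there is a conservation law of the required form

$$\partial\rho/\partial t + \boldsymbol{\nabla} \cdot \mathbf{j} = 0. \tag{3.56}$$

The notation $\psi^\dagger \overleftarrow{\boldsymbol{\nabla}}$ requires some comment: it is shorthand for three row
matrices

$$\psi^\dagger \overleftarrow{\nabla}_x \equiv \partial\psi^\dagger/\partial x \quad \text{etc.}$$

(recall that $\psi^\dagger$ is a row matrix).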
In equation (3.56), with $\rho$ being given by (3.51), the probability current
density $\mathbf{j}$ is

$$\mathbf{j}(x) = \psi^\dagger(x)\,\boldsymbol{\alpha}\,\psi(x) \tag{3.57}$$

representing a 3-vector with components

$$(\psi^\dagger \alpha_1 \psi,\ \psi^\dagger \alpha_2 \psi,\ \psi^\dagger \alpha_3 \psi). \tag{3.58}$$

We therefore have a positive-definite $\rho$ and an associated $\mathbf{j}$ satisfying the
required conservation law (3.56), which, as usual, we can write in invariant
form as $\partial_\mu j^\mu = 0$, where

$$j^\mu = (\rho, \mathbf{j}). \tag{3.59}$$

Thus $j^\mu$ is an acceptable probability current, unlike the current for the KG
equation, as we might have anticipated.
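
The step from (3.54) and (3.55) to (3.56) can be sketched in a few lines. This is essentially the content of problem 3.5, written out here under the assumption, already implicit in (3.55), that $\boldsymbol{\alpha}$ and $\beta$ are constant Hermitian matrices. Multiply (3.54) on the left by $\psi^\dagger$, multiply (3.55) on the right by $\psi$, and subtract the second from the first; the mass terms cancel:

```latex
\begin{align*}
i\,\psi^\dagger \frac{\partial\psi}{\partial t}
  &= -i\,\psi^\dagger \boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi
     + m\,\psi^\dagger\beta\psi, \\
-i\,\frac{\partial\psi^\dagger}{\partial t}\,\psi
  &= +i\,(\boldsymbol{\nabla}\psi^\dagger)\cdot\boldsymbol{\alpha}\,\psi
     + m\,\psi^\dagger\beta\psi, \\
\text{difference:}\quad
i\,\frac{\partial}{\partial t}\,(\psi^\dagger\psi)
  &= -i\,\boldsymbol{\nabla}\cdot(\psi^\dagger\boldsymbol{\alpha}\,\psi)
  \;\;\Longrightarrow\;\;
  \frac{\partial\rho}{\partial t} + \boldsymbol{\nabla}\cdot\mathbf{j} = 0 .
\end{align*}
```

The product rule is what combines the two gradient terms into a single divergence, and the result identifies $\rho = \psi^\dagger\psi$ and $\mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi$ as in (3.51) and (3.57).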

The form of equation (3.56) implies that $j^\mu$ of (3.59) is a contravariant
4-vector (cf equation (D.4)), as we verified explicitly in the KG case. The
corresponding verification is more difficult in the Dirac case, since the Dirac
spinor $\psi$ transforms non-trivially under Lorentz transformations, unlike the
KG wavefunction $\phi$. We shall come back to this problem in chapter 4.

We now turn to further discussion of the spin degree of freedom, postponing
consideration of the negative-energy solutions until section 3.4.

3.3 Spin 


Four-momentum is not the only physical property of a particle obeying the
Dirac equation. We must now interpret the column vector (Dirac spinor)
part, $\omega$, of the solution (3.41). The particular properties of the $\sigma$-matrices,
appearing in the $\alpha$-matrices, have already led us to think in terms of spin.
A further indication that this is correct comes when we consider the explicit
form of $\omega$ given in (3.47). In this equation the two-component spinor $\phi$ is
completely arbitrary. It may be chosen in just two linearly independent ways,
for example

$$\phi_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad \phi_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{3.60}$$

which (as the notation of course indicates) are in fact eigenvectors of $S_z = \frac{1}{2}\sigma_z$
with eigenvalues $\pm\frac{1}{2}$ (‘up’ and ‘down’ along the z-axis). Remember that, in
quantum mechanics, linear combinations of wavefunctions can be formed using
complex numbers as superposition coefficients, in general; so the most general
$\phi$ can always be written as

$$\phi = \begin{pmatrix} a \\ b \end{pmatrix} = a\phi_\uparrow + b\phi_\downarrow \tag{3.61}$$

where $a$ and $b$ are complex numbers. Hence, there are precisely two linearly
independent solutions, for a given 4-momentum, just as we would expect for
a quantum system with $j = \frac{1}{2}$ (the multiplicity is $2j + 1$, in general).

In the rest frame of the particle ($\mathbf{p} = 0$) this interpretation is straightforward.
In this case choosing (3.60) for the two independent $\phi$’s, the solutions
(3.61) for $E = m$ reduce to

$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \tag{3.62}$$

Since we have degeneracy between these two solutions (both have $E = m$)
there must be some operator which commutes with the energy operator, and
whose eigenvalues would distinguish the solutions (3.62). In this case the
energy operator is just $\beta m$ (from (3.54) setting $-i\boldsymbol{\nabla}$ to zero, since $\mathbf{p} = 0$) and
the required operator commuting with $\beta$ is

$$\Sigma_z = \begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} \tag{3.63}$$

which has eigenvalues $+1$ (twice) and $-1$ (twice). Our rest-frame spinors
appearing in (3.62) are indeed eigenstates of $\Sigma_z$, with eigenvalues $\pm 1$ as can be
easily verified.

Generalizing (3.63), we introduce the three matrices $\boldsymbol{\Sigma} = (\Sigma_x, \Sigma_y, \Sigma_z)$ where

$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}. \tag{3.64}$$

Then the operators $\frac{1}{2}\boldsymbol{\Sigma}$ are such that

$$[\tfrac{1}{2}\Sigma_x, \tfrac{1}{2}\Sigma_y] = i\,\tfrac{1}{2}\Sigma_z \tag{3.65}$$

(together with its cyclic permutations) and $(\frac{1}{2}\boldsymbol{\Sigma})^2 = \frac{3}{4} I$ where $I$ is now the unit $4 \times 4$ matrix. These are just the
properties expected of quantum-mechanical angular momentum operators (see
appendix A) belonging to magnitude $j = \frac{1}{2}$ (we already know that the
eigenvalues of $\frac{1}{2}\Sigma_z$ are $\pm\frac{1}{2}$). So we can interpret $\frac{1}{2}\boldsymbol{\Sigma}$ as spin-$\frac{1}{2}$ operators appropriate
to our rest-frame solutions; and, at least in the rest frame, we may say that
the Dirac equation describes a particle of spin-$\frac{1}{2}$.
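
The algebraic statements just made about $\frac{1}{2}\boldsymbol{\Sigma}$ are easy to confirm by direct matrix multiplication. The following sketch does so in plain Python; it assumes the standard-representation form $\beta = \mathrm{diag}(1, 1, -1, -1)$ (not displayed in this section), and the helper names are illustrative.

```python
# Pauli matrices, 2x2 identity and 2x2 zero, stored as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return ([ra + rb for ra, rb in zip(a, b)]
            + [rc + rd for rc, rd in zip(c, d)])


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def apply(matrix, vector):
    return [sum(row[k] * vector[k] for k in range(len(vector)))
            for row in matrix]


# Spin operators (1/2) Sigma_x, (1/2) Sigma_y, (1/2) Sigma_z from (3.64).
S = [scale(0.5, block(s, Z2, Z2, s)) for s in SIGMA]
beta = block(I2, Z2, Z2, scale(-1, I2))  # ASSUMED standard-representation beta
identity4 = block(I2, Z2, Z2, I2)
zero4 = block(Z2, Z2, Z2, Z2)

# (3.65) and its cyclic permutations: [S_i, S_j] = i S_k.
algebra_ok = all(
    close(commutator(S[i], S[(i + 1) % 3]), scale(1j, S[(i + 2) % 3]))
    for i in range(3)
)

# (Sigma/2)^2 = s(s + 1) I with s = 1/2, i.e. 3/4 times the unit matrix.
s_squared = add(add(matmul(S[0], S[0]), matmul(S[1], S[1])), matmul(S[2], S[2]))
magnitude_ok = close(s_squared, scale(0.75, identity4))

# Sigma_z commutes with the rest-frame energy operator beta m.
commutes_with_beta = close(commutator(S[2], beta), zero4)

# Action of (1/2) Sigma_z on the two rest-frame spinors of (3.62).
up = apply(S[2], [1, 0, 0, 0])
down = apply(S[2], [0, 1, 0, 0])
print(algebra_ok, magnitude_ok, commutes_with_beta, up, down)
```

The last two results show each rest-frame spinor being returned multiplied by $+\frac{1}{2}$ or $-\frac{1}{2}$, which is the sense in which the two degenerate $E = m$ solutions are labelled by spin ‘up’ and ‘down’.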

It seems reasonable to suppose that the magnitude of the spin of a particle
could not be changed by doing a Lorentz transformation, as would be required
in order to discuss the spin in a general frame with $\mathbf{p} \neq 0$. But $\frac{1}{2}\boldsymbol{\Sigma}$ is then
no longer a suitable spin operator, since it fails to commute with the energy
operator, which is now $(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m)$ from (3.54), for a plane-wave solution
with momentum $\mathbf{p}$. Yet there are still just two independent states for a given
4-momentum as our explicit solution (3.47) shows: $\phi$ can still be chosen in
only two linearly independent ways. Hence there must be some operator
which does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$, and whose eigenvalues can be used to
distinguish the two states. Actually this condition is not enough to specify
such an operator uniquely, and several choices are common. One of the most
useful is the helicity operator $h(\mathbf{p})$ defined by

$$h(\mathbf{p}) = \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \tag{3.66}$$

which (see problem 3.6) does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$. We can therefore
choose our general $\mathbf{p} \neq 0$ states to be eigenstates of $h(\mathbf{p})$. These will be
called ‘helicity states’: physically they are eigenstates of $\boldsymbol{\Sigma}$ resolved along the
direction of $\mathbf{p}$.

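Using (3.48) it is easy to see that the eigenvalues of $h(\mathbf{p})$ are $+1$ (twice)
and $-1$ (twice). Our general four-component spinor (3.47) is therefore an
eigenstate of $h(\mathbf{p})$ if

$$\begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\phi \end{pmatrix} = \pm \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\phi \end{pmatrix}. \tag{3.67}$$

Taking the + sign first, this will hold if

$$\frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\,\phi_+ = \phi_+ \tag{3.68}$$

where the + subscript has been added to indicate that this $\phi$ is a solution of
(3.68). Such a $\phi_+$ is called a two-component helicity spinor. The explicit form
of $\phi_+$ can be found by solving (3.68); see problem 3.7. Similarly, the
four-component spinor will be an eigenstate of $h(\mathbf{p})$ belonging to the eigenvalue
$-1$ if it contains $\phi_-$ where

$$\frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\,\phi_- = -\phi_-. \tag{3.69}$$

Again, these two choices $\phi_+$ and $\phi_-$ are linearly independent.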
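
One concrete way to obtain $\phi_\pm$ is the projector of problem 3.7: apply $\frac{1}{2}(1 \pm \boldsymbol{\sigma} \cdot \hat{\mathbf{u}})$, with $\hat{\mathbf{u}} = \mathbf{p}/|\mathbf{p}|$, to a trial spinor and normalize the result. The sketch below does this numerically for one illustrative direction. The function names, the angles, and the fallback to a second trial spinor (needed when the first one happens to project to zero) are choices made for this example, not part of the text.

```python
import math


def sigma_dot(u):
    """2x2 matrix sigma.u for a 3-vector u = (ux, uy, uz)."""
    ux, uy, uz = u
    return [[uz, ux - 1j * uy], [ux + 1j * uy, -uz]]


def apply(matrix, vector):
    return [sum(row[k] * vector[k] for k in range(len(vector)))
            for row in matrix]


def inner(a, b):
    """Hermitian inner product a^dagger b of two spinors."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def helicity_spinor(u, sign):
    """Normalized eigenvector of sigma.u with eigenvalue sign (+1 or -1).

    Applies the projector (1 + sign * sigma.u)/2 to a trial spinor; a second
    trial spinor is used if the first one projects to (nearly) zero.
    """
    su = sigma_dot(u)
    for trial in ([1, 0], [0, 1]):
        rotated = apply(su, trial)
        projected = [0.5 * (t + sign * s) for t, s in zip(trial, rotated)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in projected))
        if norm > 1e-9:
            return [c / norm for c in projected]
    raise ValueError("u must be a unit vector")


def eigen_residual(spinor, u, sign):
    """Largest component of (sigma.u - sign) acting on the spinor."""
    rotated = apply(sigma_dot(u), spinor)
    return max(abs(r - sign * s) for r, s in zip(rotated, spinor))


theta, azimuth = 1.1, 0.7  # illustrative polar and azimuthal angles of p
u = (math.sin(theta) * math.cos(azimuth),
     math.sin(theta) * math.sin(azimuth),
     math.cos(theta))

phi_plus = helicity_spinor(u, +1)
phi_minus = helicity_spinor(u, -1)
print(eigen_residual(phi_plus, u, +1), eigen_residual(phi_minus, u, -1))
print(abs(inner(phi_plus, phi_minus)), abs(inner(phi_plus, phi_plus)))
```

The projector works because $(\boldsymbol{\sigma} \cdot \hat{\mathbf{u}})^2 = 1$ for a unit vector, which is (3.48) again. The two spinors come out orthogonal, which is a stronger statement than the linear independence noted above.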




3.4 The negative-energy solutions 


In this section we shall first look more closely at the form of both the positive- 
and negative-energy solutions of the Dirac equation, and we shall then concen- 
trate on the physical interpretation of the negative-energy solutions of both 
the Dirac and the KG equations. 

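It will be convenient, from now on, to reserve the symbol ‘$E$’ for the
positive square root in (3.50): $E = +(\mathbf{p}^2 + m^2)^{1/2}$. The general 4-momentum in
the plane-wave solution (3.41) will be denoted by $p^\mu = (p^0, \mathbf{p})$ where $p^0$ may
be either positive or negative. With this notation equation (3.44) becomes

$$p^0 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m\mathbf{1} & \boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -m\mathbf{1} \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{3.70}$$

in our original representation for $\boldsymbol{\alpha}$ and $\beta$.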


3.4.1 Positive-energy spinors 


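For these

$$p^0 = +(\mathbf{p}^2 + m^2)^{1/2} = E > 0. \tag{3.71}$$

We eliminate $\chi$ and obtain positive-energy spinors in the form

$$\omega^{1,2} = N \begin{pmatrix} \phi^{1,2} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\phi^{1,2} \end{pmatrix} \tag{3.72}$$

with $\phi^{1\dagger}\phi^1 = \phi^{2\dagger}\phi^2 = 1$. We shall now choose $N$ so that for these
positive-energy solutions $\omega^\dagger\omega = 2E$. In this case the spinors will be denoted by $u(p, s)$,
where (problem 3.8)

$$u(p, s) = (E + m)^{1/2} \begin{pmatrix} \phi^s \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\phi^s \end{pmatrix} \qquad s = 1, 2 \tag{3.73}$$

and $s$ labels the spin degree of freedom in some suitable way (e.g. the
helicity eigenvalues). The complete plane-wave solution $\psi$ for such a positive
4-momentum state is then

$$\psi = u(p, s)\, e^{-ip \cdot x} \tag{3.74}$$

with $p^\mu = (E, \mathbf{p})$.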


3.4.2 Negative-energy spinors 
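Now we look for spinors appropriate to the solution

$$p^0 = -(\mathbf{p}^2 + m^2)^{1/2} = -E < 0 \tag{3.75}$$

($E$ is always defined to be positive). Consider first what are appropriate
solutions at rest. We have now

$$p^0 = -m \qquad \mathbf{p} = 0 \tag{3.76}$$

and

$$-m \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m\mathbf{1} & 0 \\ 0 & -m\mathbf{1} \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{3.77}$$

leading to

$$\phi = 0. \tag{3.78}$$

Thus the two independent negative-energy solutions at rest are just

$$\omega(p^0 = -m, s) = \begin{pmatrix} 0 \\ \chi^s \end{pmatrix}. \tag{3.79}$$

The solution for finite momentum $+\mathbf{p}$, i.e. for 4-momentum $(-E, \mathbf{p})$, is then

$$\omega(p^0 = -E, \mathbf{p}, s) = \begin{pmatrix} \dfrac{-\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\chi^s \\ \chi^s \end{pmatrix} \tag{3.80}$$

with $\chi^{s\dagger}\chi^s = 1$. However, it is clearly much more in keeping with relativity
if, in addition to changing the sign of $E$, we also change the sign of $\mathbf{p}$ and
consider solutions corresponding to negative 4-momentum $(-E, -\mathbf{p}) = -p^\mu$.
We therefore define

$$\omega(p^0 = -E, -\mathbf{p}, s) \equiv \omega^{3,4} = N \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\chi^{1,2} \\ \chi^{1,2} \end{pmatrix}. \tag{3.81}$$

Adopting the same $N$ as in (3.73) implies the same normalization ($\omega^\dagger\omega =
2E$) for (3.81) as in (3.73); in this case the spinors are called $v(p, s)$ where
(problem 3.8)

$$v(p, s) = (E + m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\chi^s \\ \chi^s \end{pmatrix} \qquad s = 1, 2. \tag{3.82}$$

FIGURE 3.1
Energy levels for Dirac particle.

(There is a small subtlety in the choice of $\chi^1$ and $\chi^2$ which we will come to
shortly.) The solution $\psi$ for such negative 4-momentum states is then

$$\psi = \omega(p^0 = -E, -\mathbf{p}, s)\, e^{-i(-p) \cdot x} = v(p, s)\, e^{ip \cdot x}. \tag{3.83}$$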
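
The normalization and the sign conventions of $u(p, s)$ and $v(p, s)$ can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the block Hamiltonian of (3.70), arbitrary values of $\mathbf{p}$ and $m$, the two-component spinors (3.60) for $\phi^s$, and the same basis vectors in the opposite order for $\chi^s$. Since the plane wave $e^{ip \cdot x}$ multiplying $v$ carries 3-momentum $-\mathbf{p}$ and energy $-E$, the check for $v$ uses the Hamiltonian evaluated at $-\mathbf{p}$. All function names are hypothetical.

```python
import math


def sigma_dot(p):
    """2x2 matrix sigma.p for a 3-vector p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]


def apply(matrix, vector):
    return [sum(row[k] * vector[k] for k in range(len(vector)))
            for row in matrix]


def hamiltonian(p, m):
    """alpha.p + beta m in the block form of (3.70)."""
    sp = sigma_dot(p)
    return [[m, 0] + sp[0], [0, m] + sp[1],
            sp[0] + [-m, 0], sp[1] + [0, -m]]


def energy(p, m):
    return math.sqrt(sum(c * c for c in p) + m * m)


def u_spinor(p, m, phi):
    """Positive-energy spinor (3.73)."""
    e = energy(p, m)
    lower = [c / (e + m) for c in apply(sigma_dot(p), phi)]
    return [math.sqrt(e + m) * c for c in list(phi) + lower]


def v_spinor(p, m, chi):
    """Negative-energy spinor (3.82)."""
    e = energy(p, m)
    upper = [c / (e + m) for c in apply(sigma_dot(p), chi)]
    return [math.sqrt(e + m) * c for c in upper + list(chi)]


def norm_sq(w):
    return sum(abs(c) ** 2 for c in w)


def residual(h, w, eigenvalue):
    """Largest component of (h - eigenvalue) w."""
    return max(abs(a - eigenvalue * b) for a, b in zip(apply(h, w), w))


p = (0.3, -0.4, 1.2)  # illustrative momentum, natural units
m = 1.0               # illustrative mass
E = energy(p, m)
minus_p = tuple(-c for c in p)

u_list = [u_spinor(p, m, phi) for phi in ([1, 0], [0, 1])]
v_list = [v_spinor(p, m, chi) for chi in ([0, 1], [1, 0])]

u_norms = [norm_sq(u) for u in u_list]
v_norms = [norm_sq(v) for v in v_list]
u_res = [residual(hamiltonian(p, m), u, +E) for u in u_list]
v_res = [residual(hamiltonian(minus_p, m), v, -E) for v in v_list]
print(2 * E, u_norms, v_norms, u_res, v_res)
```

All four norms should agree with $2E$ and all four residuals should vanish to rounding error. The normalization follows analytically from $(E + m)\,[1 + \mathbf{p}^2/(E + m)^2] = (E + m) + (E - m) = 2E$, using (3.48).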


3.4.3 Dirac’s interpretation of the negative-energy solutions 
of the Dirac equation 


The physical interpretation of the positive-energy solutions (3.74) is
straightforward, in terms of the $\rho$ and $\mathbf{j}$ given in section 3.2.2. They describe spin-$\frac{1}{2}$
particles with 4-momentum $(E, \mathbf{p})$ and spin appropriate to the choice of $\phi^s$; $\rho$
and the energy $p^0$ are both positive.

Unfortunately $\rho$ is also positive for the negative-energy solutions (3.83),
so we cannot eliminate them on that account. This means that for a free 
Dirac particle (e.g. an electron) the available positive- and negative-energy 
levels are as shown in figure 3.1. This, in turn, implies that a particle with 
initially positive energy can ‘cascade down’ through the negative-energy levels, 
without limit; in this case no stable positive-energy state would exist! 

In order to prevent positive-energy electrons making transitions to the 
lower, negative-energy states, Dirac postulated that the normal ‘empty’, or 
‘vacuum’, state — that with no positive-energy electrons present — is such that 
all the negative-energy states are filled with electrons. The Pauli exclusion 
principle then forbids any positive-energy electrons from falling into these 
lower energy levels. The ‘vacuum’ now has infinite negative charge and energy, 
but since all observations represent finite fluctuations in energy and charge 
with respect to this vacuum, this leads to an acceptable theory. For example, 
if one negative-energy electron is absent from the Dirac sea, we have a ‘hole’
relative to the normal vacuum:

    energy of ‘hole’ = $-(E_{\rm neg})$ → positive energy
    charge of ‘hole’ = $-(q_e)$ → positive charge.

Thus the absence of a negative-energy electron is equivalent to the presence of
a positive-energy positively charged version of the electron, that is a positron.
In the same way, the absence of a ‘spin-up’ negative-energy electron is
equivalent to the presence of a ‘spin-down’ positive-energy positron. This last point
is the reason for the subtlety in the choice of $\chi^s$ mentioned after (3.82): we
choose

$$\chi^1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad \chi^2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{3.84}$$

the opposite way round from the choice for the positive-energy spinors (3.73).

Dirac’s brilliant re-interpretation of (unfilled) negative-energy solutions in 
terms of antiparticles is one of the triumphs of theoretical physics²: Carl
Anderson received the Nobel Prize for his discovery of the positron in 1932 
(Anderson 1932). 

In this way it proved possible to obtain sensible results from the Dirac 
equation and its negative-energy solutions. It is clear, however, that the theory 
is no longer really a ‘single-particle’ theory, since we can excite electrons from 
the infinite ‘sea’ of filled negative-energy states that constitute the normal 
‘empty state’. For example, if we excite one negative-energy electron to a 
positive-energy state, we have in the final state a positive-energy electron plus 
a positive-energy positron ‘hole’ in the vacuum: this corresponds physically to 
the process of $e^+e^-$ pair creation. Thus this way of dealing with the negative-energy
problem for fermions leads us directly to the need for a quantum field
theory. The appropriate formalism will be presented later, in section 7.2. 


3.4.4 Feynman’s interpretation of the negative-energy 
solutions of the KG and Dirac equations 


It is clear that despite its brilliant success for spin-$\frac{1}{2}$ particles, Dirac’s
interpretation cannot be applied to spin-0 particles, since bosons are not subject to
the exclusion principle. Besides, spin-0 particles also have their corresponding
antiparticles (e.g. $\pi^+$ and $\pi^-$), and so do spin-1 particles ($W^+$ and $W^-$, for
instance). A consistent picture for both bosons and fermions does emerge
from quantum field theory, as we shall see in chapters 5-7, which is perhaps
one of the strongest reasons for mastering it. Nevertheless, it is useful to have
an alternative, non-field-theoretic, interpretation of the negative-energy
solutions which works for both bosons and fermions. Such an interpretation is due
to Feynman: in essence, the idea is that the negative 4-momentum solutions
will be used to describe antiparticles, for both bosons and fermions.

² At that time, this was not universally recognized. For example, Pauli (1933) wrote:
‘Dirac has tried to identify holes with antielectrons ... we do not believe that this explanation
can be seriously considered.’

We begin with bosons, for example pions, which for the present purposes
we take to be simple spin-0 particles whose wavefunctions obey the KG
equation. We decide by convention that the $\pi^+$ is the ‘particle’. We will then
have

$$\text{positive 4-momentum } \pi^+ \text{ solutions:} \quad N e^{-ip \cdot x} \tag{3.85}$$
$$\text{negative 4-momentum } \pi^+ \text{ solutions:} \quad N e^{ip \cdot x} \tag{3.86}$$

where $p^\mu = [(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}]$. The electromagnetic current for a free physical
(positive-energy) $\pi^+$ is given by the probability current for a positive-energy
solution multiplied by the charge $Q\,(= +e)$:

$$j^\mu_{\rm em}(\pi^+) = (+e) \times (\text{probability current for positive-energy } \pi^+) \tag{3.87}$$
$$= (+e)\,2|N|^2 [(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}] \tag{3.88}$$

using (3.20) and (3.85) (see problem 3.1). What about the current for the $\pi^-$?
For free physical $\pi^-$ particles of positive energy $(m^2 + \mathbf{p}^2)^{1/2}$ and momentum
$\mathbf{p}$ we expect

$$j^\mu_{\rm em}(\pi^-) = (-e)\,2|N|^2 [(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}] \tag{3.89}$$

by simply changing the sign of the charge in (3.88). But it is evident that
(3.89) may be written as

$$j^\mu_{\rm em}(\pi^-) = (+e)\,2|N|^2 [-(m^2 + \mathbf{p}^2)^{1/2}, -\mathbf{p}] \tag{3.90}$$

which is just $j^\mu_{\rm em}(\pi^+)$ with negative 4-momentum. This suggests some
equivalence between antiparticle solutions with positive 4-momentum and particle
solutions with negative 4-momentum.

Can we push this equivalence further? Consider what happens when a
system A absorbs a $\pi^+$ with positive 4-momentum $p$: its charge increases by
$+e$, and its 4-momentum increases by $p$. Now suppose that A emits a physical
$\pi^-$ with 4-momentum $k$, where the energy $k^0$ is positive. Then the charge
of A will increase by $+e$, and its 4-momentum will decrease by $k$. Now this
increase in the charge of A could equally well be caused by the absorption
of a $\pi^+$; indeed we can make the effect (as far as A is concerned) of
the $\pi^-$ emission process fully equivalent to a $\pi^+$ absorption process if we say
that the equivalent absorbed $\pi^+$ has negative 4-momentum, $-k$; in particular
the equivalent absorbed $\pi^+$ has negative energy $-k^0$. In this way, we view
the emission of a physical ‘antiparticle’ $\pi^-$ with positive 4-momentum $k$ as
equivalent to the absorption of a ‘particle’ $\pi^+$ with (unphysical) negative
4-momentum $-k$. Similar reasoning will apply to the absorption of a $\pi^-$ of
positive 4-momentum, which is equivalent to the emission of a $\pi^+$ of negative
4-momentum. Thus we are led to the following hypothesis (due to Feynman):


The emission (absorption) of an antiparticle of 4-momentum $p^\mu$ is
physically equivalent to the absorption (emission) of a particle of 4-momentum
$-p^\mu$.

FIGURE 3.2
Coulomb scattering of a $\pi^-$ by a static charge $Ze$ illustrating the Feynman
interpretation of negative 4-momentum states.


In other words the unphysical negative 4-momentum solutions of the ‘particle’ 
equation do have a role to play: they can be used to describe physical processes 
involving positive 4-momentum antiparticles, if we reverse the role of ‘entry’ 
and ‘exit’ states. 

The idea is illustrated in figure 3.2, for the case of Coulomb scattering of a
$\pi^-$ particle by a static charge $Ze$, which will be discussed later in section 8.1.3.
By convention we are taking $\pi^-$ to be the antiparticle. In the physical process
of figure 3.2(a) the incoming physical antiparticle $\pi^-$ has 4-momentum $p_i$,
and the final $\pi^-$ has 4-momentum $p_f$: both $E_i$ and $E_f$ are, of course, positive.
Figure 3.2(b) shows how the amplitude for the process can be calculated using
$\pi^+$ solutions with negative 4-momentum. The initial state $\pi^-$ of 4-momentum
$p_i$ becomes a final state $\pi^+$ with 4-momentum $-p_i$, and similarly the final state
$\pi^-$ of 4-momentum $p_f$ becomes an initial state $\pi^+$ of 4-momentum $-p_f$. Note
that in this and similar figures, the sense of the arrows always indicates the 
‘flow’ of 4-momentum, positive 4-momentum corresponding to forward flow. 

It is clear that the basic physical idea here is not limited to bosons. But
there is a difference between the KG and Dirac cases in that the Dirac equation
was explicitly designed to yield a probability density (and probability current
density) which was independent of the sign of the energy:

$$\rho = \psi^\dagger\psi \qquad \mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi. \tag{3.91}$$

Thus for any solutions of the form

$$\psi = \omega\,\phi(\mathbf{x}, t) \tag{3.92}$$

we have

$$\rho = \omega^\dagger\omega\,|\phi(\mathbf{x}, t)|^2 \tag{3.93}$$

and

$$\mathbf{j} = \omega^\dagger\boldsymbol{\alpha}\,\omega\,|\phi(\mathbf{x}, t)|^2 \tag{3.94}$$

and $\rho > 0$ always. We nevertheless want to set up a correspondence so that
positive-energy solutions describe electrons (taken to be the ‘particle’, by
convention, in this case) and negative-energy solutions describe positrons, if we
reverse the sense of incoming and outgoing waves. For the KG case this
was straightforward, since the probability current was proportional to the
4-momentum:

$$j^\mu({\rm KG}) \sim p^\mu. \tag{3.95}$$

We were therefore able to set up the correspondence for the electromagnetic
current of $\pi^+$ and $\pi^-$:

$$\pi^+: \quad j^\mu_{\rm em} \sim e\,p^\mu \qquad \text{positive-energy } \pi^+ \tag{3.96}$$
$$\pi^-: \quad j^\mu_{\rm em} \sim (-e)\,p^\mu \qquad \text{positive-energy } \pi^- \tag{3.97}$$
$$\phantom{\pi^-: \quad j^\mu_{\rm em}} = (+e)(-p^\mu) \qquad \text{negative-energy } \pi^+. \tag{3.98}$$

This simple connection does not hold for the Dirac case since $\rho > 0$ for
both signs of the energy. It is still possible to set up the correspondence,
but now an extra minus sign must be inserted ‘by hand’ whenever we have a
negative-energy fermion in the final state. We shall make use of this rule in
section 8.2.4. We therefore state the Feynman hypothesis for fermions:


The invariant amplitude for the emission (absorption) of an antifermion
of 4-momentum $p^\mu$ and spin projection $s_z$ in the rest frame is equal to
the amplitude (minus the amplitude) for the absorption (emission) of a
fermion of 4-momentum $-p^\mu$ and spin projection $-s_z$ in the rest frame.


As we shall see in chapters 5-7, the Feynman interpretation of the negative- 
energy solutions is naturally embodied in the field theory formalism. 




3.5 Inclusion of electromagnetic interactions via the 
gauge principle: the Dirac prediction of g = 2 
for the electron 
Having set up the relativistic spin-0 and spin-$\frac{1}{2}$ free-particle wave equations,
we are now in a position to use the machinery developed in chapter 2, in
order to include electromagnetic interactions. All we have to do is make the
replacement

$$\partial^\mu \to D^\mu \equiv \partial^\mu + iqA^\mu \tag{3.99}$$

for a particle of charge $q$. For the spin-0 KG equation (3.10) we obtain, after
some rearrangement (problem 3.9),

$$(\Box + m^2)\phi = -iq(\partial_\mu A^\mu + A^\mu \partial_\mu)\phi + q^2 A^2 \phi \tag{3.100}$$
$$\equiv -\hat{V}_{\rm KG}\,\phi. \tag{3.101}$$

Note that the potential $\hat{V}_{\rm KG}$ contains the differential operator $\partial_\mu$; the sign of
$\hat{V}_{\rm KG}$ is a convention chosen so as to maintain the same relative sign between
$\boldsymbol{\nabla}^2$ and $V$ as in the Schrödinger equation, for example that in (A.5).

For the Dirac equation the replacement (3.99) leads to

$$i\frac{\partial\psi}{\partial t} = [\boldsymbol{\alpha} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) + \beta m + qA^0]\,\psi \tag{3.102}$$

where $A^\mu = (A^0, \mathbf{A})$. The potential due to $A^\mu$ is therefore $\hat{V}_{\rm D} = qA^0 \mathbf{1} - q\boldsymbol{\alpha} \cdot \mathbf{A}$,
which is a $4 \times 4$ matrix acting on the Dirac spinor.

The non-relativistic limit of (3.102) is of great importance, both physically
and historically. It was, of course, first obtained by Dirac; and it provided,
in 1928, a sensational explanation of why the g-factor of the electron had the
value $g = 2$, which was then the empirical value, without any theoretical basis.

By way of background, recall from appendix A that the Schrödinger
equation for a non-relativistic spinless particle of charge $q$ in a magnetic field $\mathbf{B}$
described by a vector potential $\mathbf{A}$ such that $\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}$ is

$$-\frac{1}{2m}\boldsymbol{\nabla}^2\psi - \frac{q}{2m}\,\mathbf{B} \cdot \mathbf{L}\,\psi + \frac{q^2}{2m}\,\mathbf{A}^2\psi = i\frac{\partial\psi}{\partial t}. \tag{3.103}$$


Taking $\mathbf{B}$ along the z-axis, the $\mathbf{B} \cdot \mathbf{L}$ term will cause the usual splitting (into
states of different magnetic quantum number) of the $(2l + 1)$-fold degeneracy
associated with a state of definite $l$. In particular, though, there should be no
splitting of the hydrogen ground state which has $l = 0$. But experimentally
splitting into two levels is observed, indicating a two-fold degeneracy and thus
(see earlier) a $j = \frac{1}{2}$-like degree of freedom.

Uhlenbeck and Goudsmit (1925) suggested that the doubling of the
hydrogen ground state could be explained if the electron were given an
additional quantum number corresponding to an angular-momentum-like
observable, having magnitude $j = \frac{1}{2}$. The operators $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$ which we have already
met serve to represent such a spin angular momentum. If the contribution to
the energy operator of the particle due to its spin $\mathbf{S}$ enters into the effective
Schrödinger equation in exactly the same way as that due to its orbital
angular momentum, then we would expect an additional term on the left-hand
side of (3.103) of the form

$$-\frac{q}{2m}\,\mathbf{B} \cdot \mathbf{S}. \tag{3.104}$$

The corresponding wavefunction must now have two (spinor) components,
acted on by the $2 \times 2$ matrices in $\mathbf{S}$.

The energy difference between the two levels with eigenvalues $S_z = \pm\frac{1}{2}$
would then be $qB/2m$ in magnitude. Experimentally the splitting was found
to be just twice this value. Thus empirically the term (3.104) was modified to

$$-g\,\frac{q}{2m}\,\mathbf{B} \cdot \mathbf{S} \tag{3.105}$$

where $g$ is the ‘gyromagnetic ratio’ of the particle, with $g \approx 2$. Let us now see
how Dirac deduced the term (3.105), with the precise value $g = 2$, from his
equation.

To achieve a non-relativistic limit, we expect that we have somehow to
reduce the four-component Dirac equation to one involving just two
components, since the desired term (3.105) is only a $2 \times 2$ matrix. Looking at the
explicit form (3.72) for the free-particle positive-energy solutions, we see that
the lower two components are of order $v$ (i.e. $v/c$ with $c = 1$) times the upper
two. This suggests that, to get a non-relativistic limit, we should regard the
lower two components of $\psi$ as being small (at least in the specific
representation we are using for $\boldsymbol{\alpha}$ and $\beta$). However, since (3.102) includes the $A^\mu$-field,
this will have to be demonstrated (see (3.112)). Also, if we write the total
energy operator as $m + \hat{H}_1$, we expect $\hat{H}_1$ to be the non-relativistic energy
operator.

We let

$$\psi = \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} \tag{3.106}$$

where $\Psi$ and $\Phi$ are not free-particle solutions, and they carry the space-time
dependence as well as the spinor character (each has two components). We
set

$$\hat{H}_1 = \boldsymbol{\alpha} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) + \beta m + qA^0 - m \tag{3.107}$$

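where a $4 \times 4$ unit matrix multiplying the last two terms is understood. Then

$$\hat{H}_1 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) \\ \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) & 0 \end{pmatrix} \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} + m \begin{pmatrix} \Psi \\ -\Phi \end{pmatrix} - m \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} + qA^0 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix}. \tag{3.108}$$

Multiplying out (3.108), we obtain

$$\hat{H}_1 \Psi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\,\Phi + qA^0 \Psi \tag{3.109}$$
$$\hat{H}_1 \Phi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\,\Psi + qA^0 \Phi - 2m\Phi. \tag{3.110}$$

From (3.110), we obtain

$$(\hat{H}_1 - qA^0 + 2m)\,\Phi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\,\Psi. \tag{3.111}$$

So, if $\hat{H}_1$ (or rather any matrix element of it) is $\ll m$ and if $A^0$ is positive or,
if negative, much less in magnitude than $m/e$, we can deduce

$$\Phi \sim (\text{velocity}) \times \Psi \tag{3.112}$$

as in the free case, provided that the magnetic energy $\sim \boldsymbol{\sigma} \cdot \mathbf{A}$ is not of order
$m$. Further, if $\hat{H}_1 \ll m$ and the conditions on the fields are met, we can drop
$\hat{H}_1$ and $qA^0$ on the left-hand side of (3.111), as a first approximation, so that

$$\Phi \approx \frac{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})}{2m}\,\Psi. \tag{3.113}$$

Hence, in (3.109),

$$\hat{H}_1 \Psi \approx \frac{1}{2m}\{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\,\Psi + qA^0 \Psi. \tag{3.114}$$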


The right-hand side of (3.114) should therefore be the non-relativistic energy
operator for a spin-$\frac{1}{2}$ particle of charge $q$ and mass $m$ in a field $A^\mu$.

Consider then the case $A^0 = 0$ which is sufficient for the discussion of $g$.
We need to evaluate

$$\{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\,\Psi. \tag{3.115}$$

This requires care, because although it is true that (for example) $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$
if $\mathbf{p} = (p_x, p_y, p_z)$ are ordinary numbers which commute with each other,
the components of ‘$-i\boldsymbol{\nabla} - q\mathbf{A}$’ do not commute due to the presence of the
differential operator $\boldsymbol{\nabla}$, and the fact that $\mathbf{A}$ depends on $\mathbf{r}$. In problem 3.10
it is shown that

$$\{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\,\Psi = (-i\boldsymbol{\nabla} - q\mathbf{A})^2\,\Psi - q\,\boldsymbol{\sigma} \cdot \mathbf{B}\,\Psi. \tag{3.116}$$

The first term on the right-hand side of (3.116) when inserted into (3.114),
gives precisely the spin-0 non-relativistic Hamiltonian appearing on the
left-hand side of (3.103) (see appendix A), while the second term in (3.116) yields
exactly (3.105) with $g = 2$, recalling that $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$. Thus the non-relativistic
reduction of the Dirac equation leads to the prediction $g = 2$ for a spin-$\frac{1}{2}$
particle.
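
The origin of the $\boldsymbol{\sigma} \cdot \mathbf{B}$ term can be sketched in a few lines, as a compact alternative to the step-by-step route of problem 3.10. The shorthand $\boldsymbol{\pi} \equiv -i\boldsymbol{\nabla} - q\mathbf{A}$ is introduced here only for this sketch; the calculation uses the Pauli-matrix identity $\sigma_i\sigma_j = \delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k$ of problem 3.4(a), the relation $\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}$, and $\partial_i$ for the spatial derivative $\partial/\partial x^i$.

```latex
\begin{align*}
(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2
  &= \sigma_i\sigma_j\,\pi_i\pi_j
   = (\delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k)\,\pi_i\pi_j
   = \boldsymbol{\pi}^2 + \tfrac{i}{2}\,\epsilon_{ijk}\sigma_k\,[\pi_i,\pi_j], \\
[\pi_i,\pi_j]
  &= [-i\partial_i - qA_i,\; -i\partial_j - qA_j]
   = iq\,(\partial_i A_j - \partial_j A_i), \\
\tfrac{i}{2}\,\epsilon_{ijk}\sigma_k\,[\pi_i,\pi_j]
  &= -q\,\sigma_k\,\epsilon_{ijk}\,\partial_i A_j
   = -q\,\boldsymbol{\sigma}\cdot(\boldsymbol{\nabla}\times\mathbf{A})
   = -q\,\boldsymbol{\sigma}\cdot\mathbf{B}, \\
\frac{1}{2m}\left(-q\,\boldsymbol{\sigma}\cdot\mathbf{B}\right)
  &= -2\,\frac{q}{2m}\,\mathbf{B}\cdot\mathbf{S}
  \qquad \left(\mathbf{S} = \tfrac{1}{2}\boldsymbol{\sigma}\right).
\end{align*}
```

The last line is (3.105) with $g = 2$. The factor of 2 arises because the Dirac equation delivers $\boldsymbol{\sigma}$ while the spin operator is $\frac{1}{2}\boldsymbol{\sigma}$; for commuting components of $\boldsymbol{\pi}$ the commutator vanishes and one recovers $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$ as in (3.48).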

In actual fact, the measured g-factor of the electron (and muon) is slightly
greater than this value: $g_{\rm exp} = 2(1 + a)$. The ‘anomaly’ $a$, which is of order
$10^{-3}$ in size, is measured with quite extraordinary precision (see section 11.7)
for both the $e^-$ and $e^+$. This small correction can also be computed with
equally extraordinary accuracy, using the full theory of QED, as we shall
briefly explain in chapter 11. The agreement between theory and experiment is
phenomenal and is one example of such agreement exhibited by our ‘paradigm
theory’.

It may be worth noting that spin-$\frac{1}{2}$ hadrons, such as the proton, have
g-factors very different from the Dirac prediction. This is because they are, as
we know, composite objects and are thus (in this respect) more like atoms or
nuclei than ‘elementary particles’.


Problems 
3.1


(a) In natural units $\hbar = c = 1$ and with $2m = 1$, the Schrödinger
equation may be written as

$$-\boldsymbol{\nabla}^2\psi + V\psi - i\,\partial\psi/\partial t = 0.$$

Multiply this equation from the left by $\psi^*$ and multiply the complex
conjugate of this equation by $\psi$ (assume $V$ is real). Subtract the
two equations and show that your answer may be written in the
form of a continuity equation

$$\partial\rho/\partial t + \boldsymbol{\nabla} \cdot \mathbf{j} = 0$$

where $\rho = \psi^*\psi$ and $\mathbf{j} = i^{-1}[\psi^*(\boldsymbol{\nabla}\psi) - (\boldsymbol{\nabla}\psi^*)\psi]$.


(b) Perform the same operations for the Klein-Gordon equation and
derive the corresponding ‘probability’ density current. Show also
that for a free-particle solution

$$\phi = N e^{-ip \cdot x}$$

with $p^\mu = (E, \mathbf{p})$, the probability current $j^\mu = (\rho, \mathbf{j})$ is proportional
to $p^\mu$.


3.2


(a) Prove the following properties of the matrices $\alpha_i$ and $\beta$:

(i) $\alpha_i$ and $\beta$ ($i = 1, 2, 3$) are all Hermitian [Hint: what is the
Hamiltonian?].

(ii) ${\rm Tr}\,\alpha_i = {\rm Tr}\,\beta = 0$ where ‘Tr’ means the trace, i.e. the sum of
the diagonal elements [Hint: use ${\rm Tr}(AB) = {\rm Tr}(BA)$ for any
matrices $A$ and $B$, and prove this too!].

(iii) The eigenvalues of $\alpha_i$ and $\beta$ are $\pm 1$ [Hint: square $\alpha_i$ and $\beta$].

(iv) The dimensionality of $\alpha_i$ and $\beta$ is even [Hint: the trace of a
matrix is equal to the sum of its eigenvalues].


(b) Verify explicitly that the matrices $\boldsymbol{\alpha}$ and $\beta$ of (3.31), and of (3.40),
satisfy the Dirac conditions (3.34)-(3.36).
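3.3 For free-particle solutions of the Dirac equation

$$\psi = \omega\, e^{-ip \cdot x}$$

the four-component spinor $\omega$ may be written in terms of the two-component
spinors

$$\omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}.$$

From the Dirac equation for $\psi$

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha} \cdot \boldsymbol{\nabla} + \beta m)\psi$$

using the explicit forms for the Dirac matrices

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \qquad \beta = \begin{pmatrix} \mathbf{1} & 0 \\ 0 & -\mathbf{1} \end{pmatrix}$$

show that $\phi$ and $\chi$ satisfy the coupled equations

$$(E - m)\phi = \boldsymbol{\sigma} \cdot \mathbf{p}\,\chi$$
$$(E + m)\chi = \boldsymbol{\sigma} \cdot \mathbf{p}\,\phi$$

where $p^\mu = (E, \mathbf{p})$.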
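3.4


(a) Using the explicit forms for the $2 \times 2$ Pauli matrices, verify the
commutation (square brackets) and anticommutation (braces) relations
[note the summation convention for repeated indices: $\epsilon_{ijk}\sigma_k \equiv
\sum_{k=1}^{3} \epsilon_{ijk}\sigma_k$]:

$$[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k \qquad \{\sigma_i, \sigma_j\} = 2\delta_{ij}\mathbf{1}$$

where $\epsilon_{ijk}$ is the usual antisymmetric tensor

$$\epsilon_{ijk} = \begin{cases} +1 & \text{for an even permutation of 1, 2, 3} \\ -1 & \text{for an odd permutation of 1, 2, 3} \\ \phantom{+}0 & \text{if two or more indices are the same,} \end{cases}$$

$\delta_{ij}$ is the usual Kronecker delta, and $\mathbf{1}$ is the $2 \times 2$ unit matrix. Hence
show that

$$\sigma_i\sigma_j = \delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k.$$


(b) Use this last identity to prove the result

$$(\boldsymbol{\sigma} \cdot \mathbf{a})(\boldsymbol{\sigma} \cdot \mathbf{b}) = \mathbf{a} \cdot \mathbf{b}\,\mathbf{1} + i\boldsymbol{\sigma} \cdot \mathbf{a} \times \mathbf{b}.$$

Using the explicit $2 \times 2$ form for

$$\boldsymbol{\sigma} \cdot \mathbf{p} = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix}$$

show that

$$(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2 \mathbf{1}.$$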


3.5 Verify the conservation equation (3.56). 


3.6 Check that $h(\mathbf{p})$ as given by (3.66) does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$, the
momentum-space free Dirac Hamiltonian.


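3.7 Let $\phi$ be an arbitrary two-component spinor, and let $\hat{\mathbf{u}}$ be a unit vector.


(a) Show that $\frac{1}{2}(1 + \boldsymbol{\sigma} \cdot \hat{\mathbf{u}})\phi$ is an eigenstate of $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}}$ with eigenvalue
$+1$. The operator $\frac{1}{2}(1 + \boldsymbol{\sigma} \cdot \hat{\mathbf{u}})$ is called a projector operator for
the $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}} = +1$ eigenstate since when acting on any $\phi$ this is what
it ‘projects out’. Write down a similar operator which projects out
the $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}} = -1$ eigenstate.

(b) Construct two two-component spinors $\phi_+$ and $\phi_-$ which are
eigenstates of $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}}$ belonging to eigenvalues $\pm 1$, and normalized to $\phi_r^\dagger \phi_s =
\delta_{rs}$ for $(r, s) = (+, -)$, for the case $\hat{\mathbf{u}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$
[Hint: take the arbitrary $\phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$].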


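3.8 Positive-energy spinors $u(p, s)$ are defined by

$$u(p, s) = (E + m)^{1/2} \begin{pmatrix} \phi^s \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\phi^s \end{pmatrix} \qquad s = 1, 2$$

with $\phi^{s\dagger}\phi^s = 1$. Verify that these satisfy $u^\dagger u = 2E$.
In a similar way, negative-energy spinors $v(p, s)$ are defined by

$$v(p, s) = (E + m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\,\chi^s \\ \chi^s \end{pmatrix} \qquad s = 1, 2$$

with $\chi^{s\dagger}\chi^s = 1$. Verify that $v^\dagger v = 2E$.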


3.9 Using the KG equation together with the replacement $\partial^\mu \to \partial^\mu + iqA^\mu$,
find the form of the potential $\hat{V}_{\rm KG}$ in the corresponding equation

$$(\Box + m^2)\phi = -\hat{V}_{\rm KG}\,\phi$$

in terms of $A^\mu$.
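3.10 Evaluate

$$\{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\,\psi$$

by following the subsequent steps (or doing it your own way):


(a) Multiply the operator by itself to get

$$\{(\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla}))^2 + iq(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A}) + iq(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla}) + q^2(\boldsymbol{\sigma} \cdot \mathbf{A})^2\}\,\psi.$$

The first and last terms are, respectively, $-\boldsymbol{\nabla}^2$ and $q^2\mathbf{A}^2$ where the
$2 \times 2$ unit matrix $\mathbf{1}$ is understood. The second and third terms are
$iq(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A}\psi)$ and $iq(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla}\psi)$. These may be simplified
using the identity of problem 3.4(b), but we must be careful to treat
$\boldsymbol{\nabla}$ correctly as a differential operator.

(b) Show that $(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A})\psi = \boldsymbol{\nabla} \cdot (\mathbf{A}\psi) + i\boldsymbol{\sigma} \cdot \{\boldsymbol{\nabla} \times (\mathbf{A}\psi)\}$. Now use
$\boldsymbol{\nabla} \times (\mathbf{A}\psi) = (\boldsymbol{\nabla} \times \mathbf{A})\psi - \mathbf{A} \times \boldsymbol{\nabla}\psi$ to simplify the last term.

(c) Similarly, show that $(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})\psi = \mathbf{A} \cdot \boldsymbol{\nabla}\psi + i\boldsymbol{\sigma} \cdot (\mathbf{A} \times \boldsymbol{\nabla}\psi)$.

(d) Hence verify (3.116).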


4


Lorentz Transformations and Discrete 
Symmetries 


In this chapter we shall review various covariances (see appendix D) of the KG 
and Dirac equations, concentrating mainly on the latter. First, we consider 
Lorentz transformations (rotations and velocity transformations) and show 
how the scalar KG wavefunction and the 4-component Dirac spinor must 
transform in order that the respective equations be covariant under these 
transformations. Then we perform a similar task for the discrete transforma- 
tions of parity, charge conjugation and time reversal. The results enable us 
to construct ‘bilinear covariants’ having well-defined behaviour (scalar, pseu- 
doscalar, vector, etc.) under these transformations. This is essential for later 
work, for two reasons: first, we shall be able to do dynamical calculations in a 
way that is manifestly covariant under Lorentz transformations; and secondly 
we shall be ready to study physical problems in which the discrete transfor- 
mations are, or are not, actual symmetries of the real world, a topic to which 
we shall return in the second volume. 


4.1 Lorentz transformations 


4.1.1 The KG equation 


In order to ensure that the laws of physics are the same in all inertial frames, 
we require our relativistic wave equations to be covariant under Lorentz trans- 
formations — that is, they must have the same form in the two different frames 
(see appendix D). In the case of the KG equation 


\[ (\Box + m^2)\phi(x) = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi(x) + q^2A^2(x)\phi(x) \tag{4.1} \]
for a particle of charge $q$ in the field $A^\mu$, this requirement is taken care of,
almost automatically, by the notation. Consider a Lorentz transformation
such that $x \to x'$. $A^\mu$ will transform by the usual 4-vector transformation
law (i.e. like $x^\mu$), which we write as $A^\mu(x) \to A'^\mu(x')$. Similarly we write
the transform of $\phi$ as $\phi(x) \to \phi'(x')$. Then in the primed coordinate frame
physics must be described by the equation
\[ (\Box' + m^2)\phi'(x') = -iq[\partial'_\mu A'^\mu(x') + A'^\mu(x')\partial'_\mu]\phi'(x') + q^2A'^2(x')\phi'(x'). \tag{4.2} \]
Now the 4-dimensional dot products appearing in (4.2) are all invariant under
the Lorentz transformation, so that (4.2) can be written as
\[ (\Box + m^2)\phi'(x') = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi'(x') + q^2A^2(x)\phi'(x'), \tag{4.3} \]
and we see that the wavefunction in the primed frame may be identified (up
to a phase) with that in the unprimed frame:
\[ \phi'(x') = \phi(x). \tag{4.4} \]

Equation (4.4) is the condition for the KG equation to be covariant under
Lorentz transformations. Since $x'$ is a known function of $x$, given by the
angles and velocities parametrizing the transformation, equation (4.4) enables
one to construct the correct function $\phi'$ which the primed observers must use,
in order to be consistent with the unprimed observers.

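By way of illustration, consider a rotation of the coordinate system by an
angle $\alpha$ in a positive sense about the $x$-axis; then the position vector referred
to the new system is $\mathbf{x}' = (x', y', z')$ where
\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{4.5} \]
which we shall write as
\[ \mathbf{x}' = R_x(\alpha)\,\mathbf{x}. \tag{4.6} \]
Correspondingly, equation (4.4) is, in this case,
\[ \phi'(R_x(\alpha)\,\mathbf{x}) = \phi(\mathbf{x}), \tag{4.7} \]
which can also be written as
\[ \phi'(\mathbf{x}) = \phi(R_x^{-1}(\alpha)\,\mathbf{x}). \tag{4.8} \]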


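It is convenient to begin with an ‘infinitesimal rotation’, where the angle
$\alpha$ in (4.5) is replaced by $\epsilon_x$ such that $\cos\epsilon_x \approx 1$ and $\sin\epsilon_x \approx \epsilon_x$. Then it is
easy to verify that (4.5) becomes
\[ \mathbf{x}' = R_x(\epsilon_x)\,\mathbf{x} = \mathbf{x} - \boldsymbol{\epsilon}\times\mathbf{x} \tag{4.9} \]
where $\boldsymbol{\epsilon} = (\epsilon_x, 0, 0)$. For a general infinitesimal rotation, we simply replace this
$\boldsymbol{\epsilon}$ by a general one, $(\epsilon_x, \epsilon_y, \epsilon_z)$. For such a rotation, condition (4.8) becomes
\[ \phi'(\mathbf{x}) = \phi(\mathbf{x} + \boldsymbol{\epsilon}\times\mathbf{x}). \tag{4.10} \]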
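Expanding the right hand side to first order in $\boldsymbol{\epsilon}$ we obtain
\[ \phi'(\mathbf{x}) = \phi(\mathbf{x}) + (\boldsymbol{\epsilon}\times\mathbf{x})\cdot\boldsymbol{\nabla}\phi = \phi(\mathbf{x}) + \boldsymbol{\epsilon}\cdot(\mathbf{x}\times\boldsymbol{\nabla})\phi = (1 + i\boldsymbol{\epsilon}\cdot\hat{\mathbf{L}})\phi(\mathbf{x}) \tag{4.11} \]
where $\hat{\mathbf{L}}$ is the vector angular momentum operator $\mathbf{x}\times(-i\boldsymbol{\nabla})$.

The rule for finite rotations may be obtained from the infinitesimal form
by using the result
\[ e^A = \lim_{n\to\infty}(1 + A/n)^n \tag{4.12} \]
generalized to differential operators (the exponential of a matrix being
understood as the infinite series $\exp A = 1 + A + \frac{1}{2}A^2 + \ldots$). Let $\boldsymbol{\epsilon} = \boldsymbol{\alpha}/n$,
where $\boldsymbol{\alpha} = (\alpha_x, \alpha_y, \alpha_z)$ are three real finite parameters; we may think of the
direction of $\boldsymbol{\alpha}$ as representing the axis of the rotation, and the magnitude of
$\boldsymbol{\alpha}$ as representing the angle of rotation. Then applying the transformation
(4.11) $n$ times, and letting $n$ tend to infinity, we obtain for the finite rotation
\[ \phi'(\mathbf{x}) = e^{i\boldsymbol{\alpha}\cdot\hat{\mathbf{L}}}\phi(\mathbf{x}) \equiv \hat{U}_R(\boldsymbol{\alpha})\phi(\mathbf{x}). \tag{4.13} \]
Note that $\hat{U}_R(\boldsymbol{\alpha})$ is a unitary operator, since $\hat{U}_R^\dagger$ is the inverse rotation.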

Equation (4.13) is, of course, the familiar rule for rotations of scalar wave- 
functions, exhibiting the intimate connection between rotations and angular 
momentum in quantum mechanics. We recall that if a Hamiltonian is invari- 
ant under rotations, then the operators L commute with the Hamiltonian and 
angular momentum is conserved. 

A similar calculation may be done for velocity transformations (‘boosts’), 
leading to corresponding operators K — see problem 4.1. 


4.1.2 The Dirac equation 


The case of the Dirac equation is more complicated, because (unlike the KG $\phi$)
the wavefunction has more than one component, corresponding to the fact that 
it describes a spin-1/2 particle. There is, however, a direct connection between 
the angular momentum associated with a wavefunction, and the way that the 
wavefunction transforms under rotations of the coordinate system. To take a 
simple case, the 2p wavefunctions mentioned in section 3.2 correspond to l = 1 
on the one hand and, on the other, to the components of a vector — indeed the 
most basic vector of all, the position vector $\mathbf{x} = (x, y, z)$ itself. If we rotate
the coordinate system in the way represented by (4.5), the components in the 
primed system transform into simple linear combinations of the components 
in the original system. 

Very much the same thing happens in the case of spinor wavefunctions, 
except that they transform in a way different from — though closely related to 
— that of vectors. In the present section we shall discuss how this works for 
three-dimensional rotations of the spatial coordinate system, and explain how 
it generalizes to boosts, which include transformations of the time coordinate 
as well. It will be convenient to use the alternative representation (3.40) for 
the Dirac matrices. In this representation, the components $\phi$, $\chi$ of the
free-particle 4-spinor $\omega$ of (3.43) satisfy
\[ E\phi = \boldsymbol{\sigma}\cdot\mathbf{p}\,\phi + m\chi \tag{4.14} \]
\[ E\chi = -\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi + m\phi \tag{4.15} \]
rather than (3.45) and (3.46).
As before, we start with the infinitesimal rotation (4.9). Since $\mathbf{p}$ is a vector,
it transforms in the same way as $\mathbf{x}$, so that under an infinitesimal rotation $\mathbf{p}$
becomes $\mathbf{p}'$ where
\[ \mathbf{p}' = \mathbf{p} - \boldsymbol{\epsilon}\times\mathbf{p}. \tag{4.16} \]
The question for us now is: how do the spinors $\phi$ and $\chi$ transform under this
same rotation of the coordinate system?
The essential point is that in the new coordinate system the defining
equations (4.14) and (4.15) should take exactly the same form, namely
\[ E\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi' + m\chi' \tag{4.17} \]
\[ E\chi' = -\boldsymbol{\sigma}\cdot\mathbf{p}'\,\chi' + m\phi' \tag{4.18} \]
where $\phi'$ and $\chi'$ are the spinors in the new coordinate system, and we have
used the fact that both $E$ and $m$ do not change under rotations. Our task is
to find $\phi'$ and $\chi'$ in terms of $\phi$ and $\chi$.
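Since both $\phi$ and $\chi$ are 2-component spinors, we might guess from (4.11)
that the answer is
\[ \phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi, \qquad \chi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\chi, \tag{4.19} \]
since the $\boldsymbol{\sigma}/2$ are the spin-1/2 matrices, taking the place of $\hat{\mathbf{L}}$. To check that
this is, in fact, the correct transformation law, we proceed as follows.¹ First,
multiply (4.14) from the left by the matrix $(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)$: then, since $E$ and
$m$ commute with all matrices, the result is
\[ E\phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi + m\chi' \tag{4.20} \]
\[ \phantom{E\phi'} = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,(1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi' + m\chi' \tag{4.21} \]
where we have used
\[ (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)^{-1} \approx (1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2) \tag{4.22} \]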


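to first order in $\boldsymbol{\epsilon}$. Keeping only first order terms in $\boldsymbol{\epsilon}$, the first term on the
right hand side of (4.21) is
\[ (\boldsymbol{\sigma}\cdot\mathbf{p} + \tfrac{1}{2}i\,\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\;\boldsymbol{\sigma}\cdot\mathbf{p} - \tfrac{1}{2}i\,\boldsymbol{\sigma}\cdot\mathbf{p}\;\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon})\phi'. \tag{4.23} \]
This can be simplified using the result from problem 3.4(b):
\[ \boldsymbol{\sigma}\cdot\mathbf{a}\;\boldsymbol{\sigma}\cdot\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot\mathbf{a}\times\mathbf{b}, \tag{4.24} \]
provided all the components of $\mathbf{a}$ and $\mathbf{b}$ commute. Applying (4.24), (4.23)
becomes
\[ [\boldsymbol{\sigma}\cdot\mathbf{p} + \tfrac{1}{2}i(\boldsymbol{\epsilon}\cdot\mathbf{p} + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\mathbf{p}) - \tfrac{1}{2}i(\boldsymbol{\epsilon}\cdot\mathbf{p} + i\boldsymbol{\sigma}\cdot\mathbf{p}\times\boldsymbol{\epsilon})]\phi' \tag{4.25} \]
\[ = (\boldsymbol{\sigma}\cdot\mathbf{p} - \boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\mathbf{p})\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi'. \tag{4.26} \]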


¹We shall derive (4.19), and the corresponding rule for velocity transformations, equation
(4.42) below, in appendix M of volume 2 using group theory.

Hence (4.21) is just
\[ E\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi' + m\chi' \tag{4.27} \]
as required in (4.17). We can similarly check the correctness of the
transformation law (4.19) for $\chi$.
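
The first-order check leading to (4.26) can also be done numerically. The following is a minimal sketch, not part of the original text: it uses plain Python lists for the Pauli matrices, and the test momentum `p`, the direction `axis` and all function names are arbitrary illustrative choices. It compares $(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,(1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)$ with $\boldsymbol{\sigma}\cdot\mathbf{p}'$, where $\mathbf{p}'$ is given by (4.16); since the two agree only to first order, the mismatch should shrink like $\epsilon^2$.

```python
# Pauli matrices as nested lists of (complex) numbers.
ONE = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(*terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """sigma . v for a real 3-vector v."""
    return combine((v[0], SX), (v[1], SY), (v[2], SZ))

def cross(a, b):
    """Ordinary vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_residual(p, eps):
    """Largest entry of (1 + i s.e/2) s.p (1 - i s.e/2) - s.p', with p' = p - e x p."""
    s_eps = sigma_dot(eps)
    left = combine((1, ONE), (0.5j, s_eps))
    right = combine((1, ONE), (-0.5j, s_eps))
    lhs = matmul(matmul(left, sigma_dot(p)), right)
    e_cross_p = cross(eps, p)
    rhs = sigma_dot([p[i] - e_cross_p[i] for i in range(3)])
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

p = [0.3, -1.2, 0.7]       # arbitrary test momentum
axis = [0.5, 0.2, -0.4]    # arbitrary direction for the infinitesimal rotation
r_big = rotation_residual(p, [1e-3 * a for a in axis])
r_small = rotation_residual(p, [1e-4 * a for a in axis])
print(r_big, r_small, r_big / r_small)
```

Reducing $\boldsymbol{\epsilon}$ by a factor of 10 should reduce the residual by a factor close to 100, which is the signature of agreement at first order.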

The transformation rule for a finite rotation may be obtained from the 
infinitesimal form by using the result (4.12) applied to matrices. Then for a 
finite rotation we obtain the result 


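\[ \phi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\phi, \qquad \chi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\chi. \tag{4.28} \]
We note that the behaviour of $\phi$ and $\chi$ under rotations is the same: equation
(4.28) is the way all 2-component spinors transform under rotations.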

By way of an illustration, consider the case of the finite rotation (4.5). 
Here $\boldsymbol{\alpha} = (\alpha, 0, 0)$, and the transformation matrix is
\[ \exp(i\sigma_x\alpha/2) = 1 + i\sigma_x\alpha/2 + \frac{1}{2}(i\sigma_x\alpha/2)^2 + \ldots \tag{4.29} \]
Multiplying out the terms in (4.29) and remembering that $\sigma_x^2 = 1$, we see that
the transformation matrix is
\[ \cos\alpha/2 + i\sigma_x\sin\alpha/2 = \begin{pmatrix} \cos\alpha/2 & i\sin\alpha/2 \\ i\sin\alpha/2 & \cos\alpha/2 \end{pmatrix}. \tag{4.30} \]
This means that the components $\phi_1$, $\phi_2$ of the spinor $\phi$ transform according
to the rule
\[ \phi'_1 = \cos\alpha/2\;\phi_1 + i\sin\alpha/2\;\phi_2 \tag{4.31} \]
\[ \phi'_2 = i\sin\alpha/2\;\phi_1 + \cos\alpha/2\;\phi_2, \tag{4.32} \]
for this particular rotation. The transformed components are linear
combinations of the original components, but it is the half-angle $\alpha/2$ that enters, not
$\alpha$.

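Let us denote the finite transformation matrix by $U$, so that
\[ U = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2) \quad\text{and}\quad U^\dagger = \exp(-i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2). \tag{4.33} \]
It follows that
\[ UU^\dagger = U^\dagger U = 1, \tag{4.34} \]
since the rotation parametrized by $-\boldsymbol{\alpha}$ clearly undoes the rotation parametrized
by $\boldsymbol{\alpha}$. So $U$ is a 2 × 2 unitary matrix. It follows that the normalization of
$\phi$ and $\chi$ is preserved under rotations: $\phi'^\dagger\phi' = \phi^\dagger\phi$, and $\chi'^\dagger\chi' = \chi^\dagger\chi$. The
free-particle Dirac probability density $\rho = \psi^\dagger\psi = \phi^\dagger\phi + \chi^\dagger\chi$ is therefore also
(as we expect) invariant under rotations.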
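
As an illustration of (4.29), (4.30) and (4.34), here is a small sketch (not from the original text) that sums the exponential series for $\exp(i\sigma_x\alpha/2)$ term by term and compares it with the closed form. The angle `alpha`, the number of series terms and the helper names are arbitrary choices made for this example. It also evaluates the matrix at $\alpha = 2\pi$, where the half-angle makes it equal to $-1$, a standard property of spinors.

```python
import math

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(a, terms=40):
    """Partial sum of the series 1 + A + A^2/2! + ... for a 2 x 2 matrix A."""
    total = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def rotation_about_x(alpha):
    """U = exp(i sigma_x alpha / 2), summed as in (4.29)."""
    half = 0.5j * alpha
    return exp_series([[0j, half], [half, 0j]])

def max_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

alpha = 1.1                                    # arbitrary test angle (radians)
U = rotation_about_x(alpha)
c, s = math.cos(alpha / 2), math.sin(alpha / 2)
closed_form = [[c, 1j * s], [1j * s, c]]       # right-hand side of (4.30)
series_error = max_diff(U, closed_form)

U_dagger = [[U[j][i].conjugate() for j in range(2)] for i in range(2)]
unitarity_error = max_diff(matmul(U, U_dagger), [[1, 0], [0, 1]])

full_turn_error = max_diff(rotation_about_x(2 * math.pi), [[-1, 0], [0, -1]])
print(series_error, unitarity_error, full_turn_error)
```

All three printed numbers should be at the level of floating-point rounding.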

More interestingly, we can examine the way the free-particle current density
\[ \mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi \tag{4.35} \]
transforms under rotations. Of course, it should behave as a 3-vector, and
this is checked in problem 4.2(a).

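We now turn to the behaviour of the spinors $\phi$ and $\chi$ under boosts, which
mix $\mathbf{x}$ and $t$, or equivalently $\mathbf{p}$ and $E$. For example, consider a Lorentz
velocity transformation (boost) from a frame S to a frame S′ which is moving
with speed $u$ with respect to S along the common $x$-axis. Then the energy $E$
and momentum $p_x$ of a particle in S are transformed to $E'$ and $p'_x$ in S′ where
(cf (D.1))
\[ E' = \cosh\vartheta\,E - \sinh\vartheta\,p_x \tag{4.36} \]
\[ p'_x = \cosh\vartheta\,p_x - \sinh\vartheta\,E, \tag{4.37} \]
where $\cosh\vartheta = (1 - u^2)^{-1/2} \equiv \gamma(u)$, and $\sinh\vartheta = \gamma(u)u$. As before, we
start with an infinitesimal transformation, where $\vartheta$ is replaced by $\eta_x$ such
that $\cosh\eta_x \approx 1$ and $\sinh\eta_x \approx \eta_x$. Then (4.36) and (4.37) become
$E' = E - \eta_x p_x$, $p'_x = p_x - \eta_x E$. For the general infinitesimal boost parametrized
by $\boldsymbol{\eta} = (\eta_x, \eta_y, \eta_z)$, the transformation law for $(E, \mathbf{p})$ is
\[ E' = E - \boldsymbol{\eta}\cdot\mathbf{p} \tag{4.38} \]
\[ \mathbf{p}' = \mathbf{p} - \boldsymbol{\eta}E. \tag{4.39} \]
Once again, we have to determine $\phi'$ and $\chi'$ such that the transformed versions
of (4.14) and (4.15) are
\[ (E' - \boldsymbol{\sigma}\cdot\mathbf{p}')\phi' = m\chi' \tag{4.40} \]
\[ (E' + \boldsymbol{\sigma}\cdot\mathbf{p}')\chi' = m\phi'. \tag{4.41} \]
Note that this time $E$ does transform, according to (4.38).
The required $\phi'$ and $\chi'$ are
\[ \phi' = (1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi, \qquad \chi' = (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi. \tag{4.42} \]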


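The spinors $\phi$ and $\chi$ behaved the same under rotations, but they transform
differently under boosts. There are two kinds of 2-component spinors, $\phi$-type
and $\chi$-type, in the representation (3.40), which are distinguished by their
behaviour under boosts. The group theory behind this will be explained in
appendix M of volume 2.

To verify the rule (4.42), take equation (4.14) in the form (4.40) and
multiply from the left by the matrix $(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$, to obtain
\[ (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})\phi = m\chi', \tag{4.43} \]
or equivalently
\[ (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi' = m\chi', \tag{4.44} \]
where we have used $(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)^{-1} \approx (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$. For (4.44) to be consistent
with (4.40) we require
\[ (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2) = E' - \boldsymbol{\sigma}\cdot\mathbf{p}'. \tag{4.45} \]
Keeping only first order terms in $\boldsymbol{\eta}$, the left hand side of (4.45) is
\[ E - \boldsymbol{\sigma}\cdot\mathbf{p} + E\,\boldsymbol{\sigma}\cdot\boldsymbol{\eta} - \tfrac{1}{2}(\boldsymbol{\sigma}\cdot\mathbf{p}\;\boldsymbol{\sigma}\cdot\boldsymbol{\eta} + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}\;\boldsymbol{\sigma}\cdot\mathbf{p}) \tag{4.46} \]
\[ = E - \boldsymbol{\eta}\cdot\mathbf{p} - \boldsymbol{\sigma}\cdot(\mathbf{p} - \boldsymbol{\eta}E) \tag{4.47} \]
\[ = E' - \boldsymbol{\sigma}\cdot\mathbf{p}' \tag{4.48} \]
as required for the right hand side of (4.45).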

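For a finite boost $\phi$ and $\chi$ transform by the ‘exponentiation’ of (4.42),
namely
\[ \phi' = \exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\phi, \qquad \chi' = \exp(\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\chi \tag{4.49} \]
where the three real parameters $\boldsymbol{\vartheta} = (\vartheta_x, \vartheta_y, \vartheta_z)$ specify the direction and
magnitude of the boost. In contrast to (4.28), the transformations (4.49) are
not unitary. If we denote the matrix $\exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)$ by $B$, we have $B = B^\dagger$
rather than $B^{-1} = B^\dagger$. So $B$ does not leave $\phi^\dagger\phi$ and $\chi^\dagger\chi$ invariant. Actually
this is no surprise. We already know from section 4.1.2 that the density
$\rho = \phi^\dagger\phi + \chi^\dagger\chi$ ought to transform as the fourth component $\rho$ of the 4-vector
$j^\mu = (\rho, \mathbf{j})$. Let us check this for our infinitesimal boost:
\[ \rho' = \phi'^\dagger\phi' + \chi'^\dagger\chi' \]
\[ \phantom{\rho'} = \phi^\dagger(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi + \chi^\dagger(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi \]
\[ \phantom{\rho'} = \phi^\dagger\phi + \chi^\dagger\chi - \phi^\dagger\boldsymbol{\sigma}\phi\cdot\boldsymbol{\eta} + \chi^\dagger\boldsymbol{\sigma}\chi\cdot\boldsymbol{\eta} \]
\[ \phantom{\rho'} = \rho - \boldsymbol{\eta}\cdot\mathbf{j} \tag{4.50} \]
as required by (4.38). Similarly, it may be verified (problem 4.2(b)) that $\mathbf{j}$
transforms as the 3-vector part of the 4-vector $j^\mu$, under this infinitesimal
boost.

On the other hand, the products $\phi^\dagger\chi$ and $\chi^\dagger\phi$ are clearly invariant under
the transformation (4.49), since the exponential factors cancel. This means
that the quantity $\psi^\dagger\beta\psi$ is a Lorentz invariant.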

At this point it is beginning to be clear that a more ‘covariant-looking’
notation would be very desirable. In the case of the KG probability current,
the 4-vector index $\mu$ was clearly visible in the expression on the right-hand side
of (3.20), but there is nothing similar in the Dirac case so far. In problem 4.3
the four ‘$\gamma$ matrices’ are introduced, defined by $\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$ with $\gamma^0 = \beta$ and
$\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, together with the quantity $\bar\psi = \psi^\dagger\gamma^0$, in terms of which the Dirac
$\rho$ of (3.51) and $\mathbf{j}$ of (3.57) can be written as $\bar\psi(x)\gamma^0\psi(x)$ and $\bar\psi(x)\boldsymbol{\gamma}\psi(x)$
respectively. The complete Dirac 4-current is then
\[ j^\mu = \bar\psi(x)\gamma^\mu\psi(x). \tag{4.51} \]
For free particle solutions, we (and problem 4.2) have established that $j^\mu$
of (4.51) indeed transforms as a 4-vector under infinitesimal rotations and
boosts. We have also just seen that the quantity $\bar\psi\psi$ is an invariant.

We end this section by illustrating the use of the finite boost
transformations (4.49). Consider two frames S and S′, such that in S a particle is at rest
with $E = m$, $\mathbf{p} = \mathbf{0}$, and with spin up along the $z$-axis; in S′, the particle has
energy $E'$, momentum $\mathbf{p}' = (0, 0, p')$, and spin up along the $z$-axis. If we apply
a boost such that S′ has velocity $(0, 0, -v')$ relative to S, where $v' = p'/E'$,
then $E$ and $\mathbf{p}$ become
\[ E' = \cosh\vartheta'\,E = m\gamma(v') \tag{4.52} \]
\[ p' = \sinh\vartheta'\,E = mv'\gamma(v') \tag{4.53} \]
as required. Now consider the forms of the 4-spinors in S and S′. In S,
from (4.14) and (4.15) we have simply $\phi = \chi$, and if we normalize such that
$\bar{u}u = 2m$ we may take
\[ u_{\rm S} = \sqrt{m}\begin{pmatrix} \phi_+ \\ \phi_+ \end{pmatrix}, \qquad \phi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{4.54} \]


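In S′ the spinor is
\[ u_{\rm S'} = N\begin{pmatrix} \phi_+ \\ \dfrac{E' - \boldsymbol{\sigma}\cdot\mathbf{p}'}{m}\,\phi_+ \end{pmatrix} = N\begin{pmatrix} \phi_+ \\ \dfrac{E' - p'}{m}\,\phi_+ \end{pmatrix} \tag{4.55} \]
where the normalization $N$ is determined (since $\bar{u}u$ is invariant) from the
condition $\bar{u}_{\rm S'}u_{\rm S'} = 2m$ to be $N = (E' + p')^{1/2}$, giving
\[ u_{\rm S'} = \begin{pmatrix} (E' + p')^{1/2}\,\phi_+ \\ (E' - p')^{1/2}\,\phi_+ \end{pmatrix}. \tag{4.56} \]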


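But we can also calculate $u_{\rm S'}$ by applying the transformation (4.49) with
$\boldsymbol{\vartheta} = (0, 0, \vartheta_z)$ and $\tanh\vartheta_z = -v'$ (so that $\vartheta_z = -\vartheta'$) to $u_{\rm S}$. Then the upper
two components become
\[ \phi' = \sqrt{m}\,e^{\sigma_z\vartheta'/2}\phi_+ = \sqrt{m}\,e^{\vartheta'/2}\phi_+, \tag{4.57} \]
while the lower two components become
\[ \chi' = \sqrt{m}\,e^{-\vartheta'/2}\phi_+. \tag{4.58} \]
Now we can write
\[ e^{\vartheta'/2} = (e^{\vartheta'})^{1/2} = (\cosh\vartheta' + \sinh\vartheta')^{1/2} = \left(\frac{E' + p'}{m}\right)^{1/2} \tag{4.59} \]
and
\[ e^{-\vartheta'/2} = \left(\frac{E' - p'}{m}\right)^{1/2}, \tag{4.60} \]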


and so we recover (4.56). 
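
The agreement between the boosted rest-frame spinor (4.57), (4.58) and the directly constructed form (4.56) is easy to confirm numerically. The sketch below is an illustration rather than part of the original text; the mass and momentum values and the function name `boosted_amplitudes` are arbitrary choices. It computes the rapidity from $\tanh\vartheta' = v' = p'/E'$ and compares the two amplitudes multiplying $\phi_+$ with $(E' \pm p')^{1/2}$.

```python
import math

def boosted_amplitudes(m, p):
    """Return E' and the amplitudes multiplying phi_+ in (4.57) and (4.58)."""
    energy = math.sqrt(m * m + p * p)
    theta = math.atanh(p / energy)                # rapidity: tanh(theta') = p'/E'
    upper = math.sqrt(m) * math.exp(theta / 2)    # phi' component, (4.57)
    lower = math.sqrt(m) * math.exp(-theta / 2)   # chi' component, (4.58)
    return energy, upper, lower

m, p = 0.938, 2.5                                 # arbitrary test values (units with c = 1)
energy, upper, lower = boosted_amplitudes(m, p)

expected_upper = math.sqrt(energy + p)            # upper entry of (4.56)
expected_lower = math.sqrt(energy - p)            # lower entry of (4.56)
invariant = 2 * upper * lower                     # u-bar u in representation (3.40)
print(upper, expected_upper, lower, expected_lower, invariant)
```

The last printed number is $\bar{u}u$, which should equal $2m$ both before and after the boost.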


4.2 Discrete transformations: P, C and T 


The transformations we considered in section 4.1 are known as ‘continuous’, 
because the parameters involved (angles, speeds) vary continuously. This is 
essentially the reason we were able to build up finite transformations from 
infinitesimal ones, which differ only slightly from the identity transformation: 
finite transformations could be reached continuously from the identity. But 
there is another class of transformations, called ‘discrete’, which cannot be 
reached continuously from the identity. Examples of discrete transformations 
are parity (or space inversion), charge conjugation, and time reversal, and 
their combinations. Although these discrete transformations are important 
primarily in weak interactions, which we shall not cover until the second vol- 
ume, it is useful to discuss the behaviour of Dirac wavefunctions under discrete 
transformations at this stage. Among other things, more light will be cast on 
antiparticles. 


4.2.1 Parity 
The parity (or space inversion) transformation P is defined by
\[ {\rm P}: \mathbf{x} \to \mathbf{x}' = -\mathbf{x}, \qquad t \to t \tag{4.61} \]
that is, P inverts the spatial coordinates. It follows that P also inverts
momenta ($\mathbf{p} \to -\mathbf{p}$) but does not change angular momenta ($\mathbf{x}\times\mathbf{p} \to \mathbf{x}\times\mathbf{p}$) or
spin ($\boldsymbol{\sigma} \to \boldsymbol{\sigma}$). We already see that there are two kinds of 3-vectors: polar
3-vectors which change sign under P and axial vectors which do not. For
example, the electric field $\mathbf{E}$ and the vector potential $\mathbf{A}$ are polar vectors, while
the magnetic field $\mathbf{B}$ is an axial vector. There are also scalar quantities (such
as $\mathbf{x}\cdot\mathbf{p}$) which do not change sign under P, and pseudoscalar quantities (such
as $\boldsymbol{\sigma}\cdot\mathbf{p}$) which do.

Consider first the KG equation (4.1). Since $\mathbf{A}$ is a polar vector, it changes
sign under parity, as does $\boldsymbol{\nabla}$, while both $\partial/\partial t$ and $A^0$ remain the same. The
scalar products $\partial_\mu A^\mu$ and $A^\mu\partial_\mu$ are therefore invariant under parity, as are
$\Box$ and $A^2$. Hence we may identify $\phi_{\rm P}(\mathbf{x}', t) = \phi(\mathbf{x}, t)$, or equivalently
\[ \phi_{\rm P}(\mathbf{x}, t) = \phi(-\mathbf{x}, t) \equiv \hat{P}_0\,\phi(\mathbf{x}, t), \tag{4.62} \]
where $\hat{P}_0$ is the coordinate inversion operator. Note that we are calling the
transformed wavefunction $\phi_{\rm P}$ rather than yet another $\phi'$ since we need to
keep track of what transformation we are considering. If we take $\phi(\mathbf{x}, t)$ to be
a positive-energy free particle solution with energy $E$ and momentum $\mathbf{p}$, $\phi_{\rm P}$
will describe a positive energy particle with momentum $-\mathbf{p}$, as we expect.
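Now let us study the covariance of the free particle Dirac equation
\[ i\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi(\mathbf{x}, t) + \beta m\,\psi(\mathbf{x}, t) \tag{4.63} \]
under P. Equation (4.63) will be covariant under (4.61) if we can find a
wavefunction $\psi_{\rm P}(\mathbf{x}', t)$ for observers using the transformed coordinate system
such that their Dirac equation has exactly the same form in their system as
(4.63):
\[ i\frac{\partial\psi_{\rm P}(\mathbf{x}', t)}{\partial t} = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}'\psi_{\rm P}(\mathbf{x}', t) + \beta m\,\psi_{\rm P}(\mathbf{x}', t). \tag{4.64} \]
Now we know that $\boldsymbol{\nabla}' = -\boldsymbol{\nabla}$, since $\mathbf{x}' = -\mathbf{x}$. Hence (4.64) becomes
\[ i\frac{\partial\psi_{\rm P}(\mathbf{x}', t)}{\partial t} = i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi_{\rm P}(\mathbf{x}', t) + \beta m\,\psi_{\rm P}(\mathbf{x}', t). \tag{4.65} \]
Multiplying this equation from the left by $\beta$ and using $\beta\boldsymbol{\alpha} = -\boldsymbol{\alpha}\beta$ we find
\[ i\frac{\partial}{\partial t}[\beta\psi_{\rm P}(\mathbf{x}', t)] = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}[\beta\psi_{\rm P}(\mathbf{x}', t)] + \beta m[\beta\psi_{\rm P}(\mathbf{x}', t)]. \tag{4.66} \]
Comparing (4.66) and (4.63), it follows that we may consistently translate
between $\psi$ and $\psi_{\rm P}$ using the relation
\[ \psi(\mathbf{x}, t) = \beta\psi_{\rm P}(-\mathbf{x}, t), \tag{4.67} \]
or equivalently
\[ \psi_{\rm P}(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t). \tag{4.68} \]
Equation (4.68) is the required relation between the wavefunctions in the two
systems; it may be compared to (4.4) and (4.62).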

In principle we could include an arbitrary phase factor $\eta_{\rm P}$ on the right
hand of (4.68) and (4.62); such a phase leaves the normalization of $\phi$ and $\psi$,
and all bilinears of the form $\bar\psi$ (gamma matrix) $\psi$ unaltered. The possibility
of such a phase factor did not arise in the case of Lorentz transformations,
since for infinitesimal ones the transformed $\psi'$ and the original $\psi$ differ only
infinitesimally (not by a finite phase factor). But the parity transformation
cannot be built up out of infinitesimal steps: the coordinate system is either
reflected or it is not. We will choose $\eta_{\rm P} = 1$.

As an example of (4.68), consider the free particle solutions in the standard
form (3.41), (3.72):
\[ \psi(\mathbf{x}, t) = N\begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E + m}\,\phi \end{pmatrix}\exp(-iEt + i\mathbf{p}\cdot\mathbf{x}). \tag{4.69} \]
Then
\[ \psi_{\rm P}(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t) = N\begin{pmatrix} \phi \\ -\dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E + m}\,\phi \end{pmatrix}\exp(-iEt - i\mathbf{p}\cdot\mathbf{x}) \tag{4.70} \]
which can be conveniently summarized by the simple statement that the
three-momentum $\mathbf{p}$ as seen in the parity transformed system is minus that in the
original one, as expected. Note that $\boldsymbol{\sigma}$ does not change sign.
It is also interesting to look at the behaviour of the spinors $\phi$ and $\chi$ in the
representation (3.40), where they satisfy the equations (4.14) and (4.15). Under
parity $\mathbf{p} \to -\mathbf{p}$, so we can immediately see that $\phi_{\rm P} = \chi$ and $\chi_{\rm P} = \phi$. Thus
the 2-component spinors $\phi$ and $\chi$ are (in this representation) interchanged
under parity.

The analysis leading to (4.68) may be extended to the case of the Dirac
equation (3.102) for a particle of charge $q$ in the field $A^\mu$. As already noted,
$\mathbf{A}$ is a polar vector, transforming under P like $\mathbf{x}$ or $\boldsymbol{\nabla}$; the scalar potential $A^0$ is
invariant under parity. The combination $(-i\boldsymbol{\nabla} - q\mathbf{A})$ therefore changes sign
under parity, and the manipulations following (4.65) proceed as before.

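We may introduce a corresponding parity operator $\hat{P}$, which is unitary
and acts on wavefunctions so as to change $\psi$ into $\psi_{\rm P}$; then
\[ \hat{P}\psi(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t) = \beta\hat{P}_0\psi(\mathbf{x}, t), \tag{4.71} \]
so that
\[ \hat{P} = \beta\hat{P}_0. \tag{4.72} \]
Applying $\hat{P}$ twice, we find
\[ \hat{P}^2\psi(\mathbf{x}, t) = \psi(\mathbf{x}, t) \tag{4.73} \]
which implies that the eigenvalues of $\hat{P}$ are $\pm 1$.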

For example, the positive energy rest-frame spinors ((3.73) with $\mathbf{p} = \mathbf{0}$)
are eigenstates of $\hat{P}$ with eigenvalue +1, and the negative energy rest-frame
spinors are eigenstates of $\hat{P}$ with eigenvalue −1. Such rest-frame eigenvalues
of $\hat{P}$ are called intrinsic parities. The correspondence between negative energy
solutions and antiparticles, discussed in the preceding section, then suggests
that a fermion and its antiparticle have opposite intrinsic parity (note that
the parity eigenvalue is multiplicative). We shall be able to derive this result
after quantization of the Dirac field, in chapter 7.

As usual in quantum mechanics, we may consider the action of $\hat{P}$ on
operators as well as wavefunctions. In particular, the parity transform of a Dirac
Hamiltonian $\hat{H}(\mathbf{x})$ will be
\[ \hat{P}\hat{H}(\mathbf{x})\hat{P}^\dagger = \beta\hat{P}_0\hat{H}(\mathbf{x})\hat{P}_0^\dagger\beta. \tag{4.74} \]
If the Hamiltonian is invariant under parity, the right hand side of (4.74) will
equal $\hat{H}$ and the operator $\hat{P}$ will commute with $\hat{H}$; the eigenvalue of $\hat{P}$ will
then be conserved. The reader may easily check that the Hamiltonian for the
charged particle in a field $A^\mu$ is parity invariant, using $\hat{P}_0\mathbf{A}\hat{P}_0^\dagger = -\mathbf{A}$.

With the rule (4.68) in hand, we can examine how various bilinear
covariants, such as $\bar\psi\psi$ or $\bar\psi\gamma^\mu\psi$, transform under parity. For example,
\[ \bar\psi_{\rm P}(\mathbf{x}', t)\psi_{\rm P}(\mathbf{x}', t) = \psi^\dagger(\mathbf{x}, t)\beta\beta\beta\psi(\mathbf{x}, t) = \bar\psi(\mathbf{x}, t)\psi(\mathbf{x}, t), \tag{4.75} \]
showing that $\bar\psi\psi$ is a scalar. Similarly, for a 4-vector
\[ v^\mu(\mathbf{x}, t) = (v^0(\mathbf{x}, t), \mathbf{v}(\mathbf{x}, t)) = \bar\psi(\mathbf{x}, t)\gamma^\mu\psi(\mathbf{x}, t), \tag{4.76} \]
the reader may check in problem 4.4(a) that $v^0$ is a scalar and $\mathbf{v}$ is a polar
vector.
More interesting possibilities emerge when we introduce a new $\gamma$-matrix,
$\gamma_5$, defined by
\[ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3. \tag{4.77} \]
This matrix has the defining property that it anticommutes with the $\gamma^\mu$
matrices:
\[ \{\gamma_5, \gamma^\mu\} = 0. \tag{4.78} \]


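Consider now the quantity $p(\mathbf{x}, t) = \bar\psi(\mathbf{x}, t)\gamma_5\psi(\mathbf{x}, t)$. We find
\[ \bar\psi_{\rm P}(\mathbf{x}', t)\gamma_5\psi_{\rm P}(\mathbf{x}', t) = \bar\psi(\mathbf{x}, t)\beta\gamma_5\beta\psi(\mathbf{x}, t) = -\bar\psi(\mathbf{x}, t)\gamma_5\psi(\mathbf{x}, t), \tag{4.79} \]
so that $p(\mathbf{x}, t)$ is a pseudoscalar. Similarly, the reader may verify in problem
4.4(b) that the quantity $a^\mu(\mathbf{x}, t) = \bar\psi(\mathbf{x}, t)\gamma_5\gamma^\mu\psi(\mathbf{x}, t)$ transforms under
(infinitesimal) rotations and boosts as a 4-vector, but that under parity $a^0(\mathbf{x}, t)$
is a pseudoscalar and $\mathbf{a}(\mathbf{x}, t)$ is an axial vector.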
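
The algebraic properties of $\gamma_5$ used above can be checked by explicit matrix multiplication. The following sketch is an illustration, not part of the original text. It assumes the standard (Dirac) representation, $\beta = \mathrm{diag}(1, 1, -1, -1)$ with $\boldsymbol{\alpha}$ built from off-diagonal Pauli blocks, together with $\gamma^0 = \beta$ and $\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$ as quoted in section 4.1.2; the helper names are hypothetical. It verifies (4.78), that $\gamma_5^2 = 1$, and the relation $\beta\gamma_5\beta = -\gamma_5$ that produces the sign in (4.79).

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4 x 4 matrix from the 2 x 2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
MINUS_ONE = [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Standard representation (an assumption of this sketch).
beta = block(ONE, ZERO, ZERO, MINUS_ONE)
alphas = [block(ZERO, s, s, ZERO) for s in PAULI]
gammas = [beta] + [matmul(beta, a) for a in alphas]   # gamma^0 ... gamma^3

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3, equation (4.77).
gamma5 = gammas[0]
for g in gammas[1:]:
    gamma5 = matmul(gamma5, g)
gamma5 = [[1j * x for x in row] for row in gamma5]

def max_entry(m):
    """Largest absolute value among the entries of m."""
    return max(abs(x) for row in m for x in row)

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(4)] for i in range(4)]

identity = block(ONE, ZERO, ZERO, ONE)
anticommutator_size = max(
    max_entry(add(matmul(gamma5, g), matmul(g, gamma5))) for g in gammas)
square_error = max_entry(add(matmul(gamma5, gamma5), identity, sign=-1))
parity_flip_error = max_entry(add(matmul(matmul(beta, gamma5), beta), gamma5))
print(anticommutator_size, square_error, parity_flip_error)
```

All three numbers should vanish; the same test could be repeated in representation (3.40), since the relations are representation independent.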

Matrix elements formed from $v^\mu$ and $a^\mu$ would have to be Lorentz invariant,
of the form $v_\mu v^\mu$, $a_\mu a^\mu$, or $v_\mu a^\mu$. For the first of these, we find (shortening
the notation)
\[ v_{{\rm P}\mu}v_{\rm P}^\mu = v^0v^0 - (-\mathbf{v})\cdot(-\mathbf{v}) = v_\mu v^\mu, \tag{4.80} \]
and similarly $a_{{\rm P}\mu}a_{\rm P}^\mu = a_\mu a^\mu$. Thus both of these matrix elements are scalars,
taking the same form in both systems. However, this is not true of $v_\mu a^\mu$:
\[ v_{{\rm P}\mu}a_{\rm P}^\mu = v^0(-a^0) - (-\mathbf{v})\cdot(\mathbf{a}) = -v_\mu a^\mu, \tag{4.81} \]
showing that this quantity is a pseudoscalar, changing sign when we change
systems. By itself, such a sign change would be irrelevant, since observables
will depend on the modulus squared of the matrix element. If, however, the
matrix element for a process has the form $(v_\mu - a_\mu)(v^\mu - a^\mu)$, for example,
where both scalar and pseudoscalar parts are present, then the physics in one
coordinate system and in the parity-transformed system will not be the same.
One says ‘parity is violated’: only one of the systems can represent the real
world; parity is conserved if physics in the two coordinate systems is the same.

Lee and Yang (1956) were the first to point out that, while there was strong
evidence for parity conservation in strong and electromagnetic interactions, its
status in weak interactions was at that time untested. They proposed that a
clear signal of parity violation could be found in weak decays from initially
polarized states (i.e. $\langle\mathbf{s}\rangle \neq 0$): if the distribution of final state particles
depends on odd powers of the cosine of the angle between the initial spin
direction and the final momentum, then parity is violated (note that $\langle\mathbf{s}\rangle\cdot\mathbf{p}$
is a pseudoscalar). The first experiment to demonstrate parity violation was
performed by Wu et al. (1957), using the $\beta$-decay of polarized $^{60}$Co. Lee and
Yang (1956) also remarked that parity violation in the decay
\[ \pi^+ \to \mu^+ + \nu_\mu \tag{4.82} \]
implies that the spin of the muon will be polarized along the direction of its
momentum, and furthermore that the angular distribution of positrons in the
subsequent decay
\[ \mu^+ \to e^+ + \bar\nu_\mu + \nu_e \tag{4.83} \]
would (as in the $^{60}$Co experiment) serve as an analyser. This suggestion
was quickly confirmed by Garwin et al. (1957) and by Friedman and Telegdi
(1957); in the rest frame of the pion, the $\mu^+$ spin is aligned opposite to its
momentum, a situation that would be reversed in the parity transformed
frame.

The end result of many years of research was to establish that the currents
responsible for weak interactions of quarks and leptons have precisely the
‘$v^\mu - a^\mu$’ structure, leading to the observed parity violation (see volume 2).


4.2.2 Charge conjugation 


Dirac’s hole theory led him to the remarkable prediction of the positron, and 
suggested a new kind of symmetry: to each charged spin-1/2 particle there 
must correspond an antiparticle with the opposite charge and the same mass. 
Feynman’s interpretation of the negative energy solutions of the KG and Dirac 
equations assumes that this symmetry holds for both bosons and fermions. 
We now explore the idea of particle-antiparticle symmetry more formally. 

We begin with the KG equation for a spin-0 particle of mass $m$ and charge
$q$ in an electromagnetic field $A^\mu$, namely equation (4.1). Inspection of this
equation shows at once that the wave function $\phi_{\rm C}$ of a particle with the same
mass and charge $-q$ is related to the original wavefunction $\phi$ by
\[ \phi_{\rm C} = \eta_{\rm C}\,\phi^* \tag{4.84} \]
where $\eta_{\rm C}$ is an arbitrary phase factor which we shall take to be unity. Equation
(4.84) tells us how to connect the solutions of the particle (charge $q$) and
antiparticle (charge $-q$) equations. When applied to free-particle solutions of
the KG equation, the transformation (4.84) relates positive and negative
4-momentum solutions, as expected in the Feynman interpretation of the latter.
We may extend the transformation (4.84) to a symmetry operation for the
KG equation (4.1) if we introduce an operation which changes the sign of $A^\mu$.
Then the combined operation ‘take the complex conjugate of $\phi$ and change $A^\mu$
to $-A^\mu$’ is a formal symmetry of (4.1), in the sense that the wavefunction $\phi^*$
in the field $-A^\mu$ satisfies exactly the same equation as does the wavefunction
$\phi$ in the field $A^\mu$. Of course, we have just seen that $\phi^*$ is the antiparticle
wavefunction, so it is no surprise that the dynamics of the antiparticle in
a field $-A^\mu$ is the same as that of the particle in a field $A^\mu$. Still, this is
a symmetry of the KG equation, which we will call charge conjugation, denoted
by C:
\[ {\rm C}: \phi \to \phi_{\rm C} = \phi^*, \qquad A^\mu \to A^\mu_{\rm C} = -A^\mu. \tag{4.85} \]
We can ask: how does the electromagnetic current behave under this
transformation? The expression for the KG current is found by multiplying the
free-particle probability current by the charge $q$, and by replacing $\partial^\mu$ by the
gauge-invariant operator $D^\mu = \partial^\mu + iqA^\mu$. This leads to
\[ j^\mu_{\rm KG\,em}(\phi, A^\mu) = iq\{\phi^*(\partial^\mu + iqA^\mu)\phi - [(\partial^\mu + iqA^\mu)\phi]^*\phi\} \]
\[ \phantom{j^\mu_{\rm KG\,em}(\phi, A^\mu)} = iq[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi] - 2q^2A^\mu\phi^*\phi. \tag{4.86} \]
The current for $\phi_{\rm C}$, $A^\mu_{\rm C}$ is then
\[ j^\mu_{\rm KG\,em}(\phi_{\rm C}, A^\mu_{\rm C}) = iq[\phi_{\rm C}^*\partial^\mu\phi_{\rm C} - (\partial^\mu\phi_{\rm C}^*)\phi_{\rm C}] - 2q^2A^\mu_{\rm C}\phi_{\rm C}^*\phi_{\rm C} \]
\[ \phantom{j^\mu_{\rm KG\,em}(\phi_{\rm C}, A^\mu_{\rm C})} = iq[\phi\,\partial^\mu\phi^* - (\partial^\mu\phi)\phi^*] + 2q^2A^\mu\phi\,\phi^* \]
\[ \phantom{j^\mu_{\rm KG\,em}(\phi_{\rm C}, A^\mu_{\rm C})} = -j^\mu_{\rm KG\,em}(\phi, A^\mu). \tag{4.87} \]
As we would hope, the KG current changes sign under C.
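Now consider the Dirac equation for a particle of mass $m$ and charge $q$ in
a field $A^\mu$, which we write in the form
\[ \frac{\partial\psi}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + iq\boldsymbol{\alpha}\cdot\mathbf{A} - i\beta m - iqA^0)\psi. \tag{4.88} \]
We want to relate solutions of this equation to the solution $\psi_{\rm C}$ of the same
equation with $q$ replaced by $-q$. As in the KG case, we begin by writing down
the complex conjugate equation,
\[ \frac{\partial\psi^*}{\partial t} = (-\alpha_1\partial_1 + \alpha_2\partial_2 - \alpha_3\partial_3 - iq\alpha_1A^1 + iq\alpha_2A^2 - iq\alpha_3A^3 + i\beta m + iqA^0)\psi^* \tag{4.89} \]
(with $\partial_i \equiv \partial/\partial x^i$), where we have used the fact that $\alpha_1$, $\alpha_3$ and $\beta$ are real and $\alpha_2$ is pure
imaginary, which is the case in both the standard representation of the Dirac
matrices, and the representation (3.40). Now imagine multiplying (4.89) from
the left by a matrix $c$, with the properties that it commutes with $\alpha_1$ and $\alpha_3$,
but anticommutes with $\alpha_2$ and $\beta$. Then (4.89) will become
\[ \frac{\partial(c\psi^*)}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} - iq\boldsymbol{\alpha}\cdot\mathbf{A} - i\beta m + iqA^0)\,c\psi^* \tag{4.90} \]
which is just (4.88) with $q$ replaced by $-q$. So we may identify the
charge-conjugate Dirac wavefunction as
\[ \psi_{\rm C} = \eta_{\rm C}\,c\psi^* \tag{4.91} \]
where $\eta_{\rm C}$ is the usual arbitrary phase factor. The required $c$ is
\[ c = \beta\alpha_2 = \gamma^2 \tag{4.92} \]
as the reader may easily verify. It is customary to choose $\eta_{\rm C} = i$, and so
finally the connection between $\psi_{\rm C}$ and $\psi$ is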


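\[ \psi_{\rm C}(x) = C_0\psi^*(x), \quad\text{where}\quad C_0 = i\gamma^2. \tag{4.93} \]
Let us look at the effect of the transformation (4.93) on free-particle
solutions of the Dirac equation. Referring to (3.73) we find that a positive energy
spinor is transformed to
\[ u_{\rm C}(p, s) = (E + m)^{1/2}\,i\gamma^2\begin{pmatrix} \phi^{s*} \\ \dfrac{\boldsymbol{\sigma}^*\cdot\mathbf{p}}{E + m}\,\phi^{s*} \end{pmatrix} = (E + m)^{1/2}\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E + m}\,(-i\sigma_2\phi^{s*}) \\ (-i\sigma_2\phi^{s*}) \end{pmatrix} \tag{4.94} \]
where we have used $\sigma_2^* = -\sigma_2$, $\sigma_2\sigma_1 = -\sigma_1\sigma_2$ and $\sigma_2\sigma_3 = -\sigma_3\sigma_2$. The
4-spinor (4.94) is a negative energy solution $v(p, s)$ as in (3.82), identifying
$-i\sigma_2\phi^{s*}$ with $\chi^s$. Accordingly we have shown that
\[ u_{\rm C}(p, s) = v(p, s). \tag{4.95} \]
Similarly, as the reader may check,
\[ v_{\rm C}(p, s) = i\gamma^2v^*(p, s) = u(p, s). \tag{4.96} \]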


So from a positive energy free-particle spinor associated with 4-momentum p 
and spin s the transformation (4.93) produces a negative energy free-particle 
spinor associated with the same 4-momentum and spin, and vice versa: that 
is, u and v are charge-conjugate spinors. 
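
Both the properties required of the matrix $c = \beta\alpha_2$ and the result $u_{\rm C} = v$ lend themselves to a direct numerical check. The sketch below is an illustration and not part of the original text. It assumes the standard representation, $\beta = \mathrm{diag}(1, 1, -1, -1)$ with $\boldsymbol{\alpha}$ built from off-diagonal Pauli blocks, and the free-particle spinors $u$ and $v$ in the form given in problem 3.8; the test momentum, mass, 2-spinor and all function names are arbitrary choices.

```python
import math

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
MINUS_ONE = [[-1, 0], [0, -1]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    """Matrix acting on a column vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def block(a, b, c, d):
    """Assemble a 4 x 4 matrix from the 2 x 2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

# Standard representation (an assumption of this sketch).
beta = block(ONE, ZERO, ZERO, MINUS_ONE)
alphas = [block(ZERO, s, s, ZERO) for s in PAULI]
c_matrix = matmul(beta, alphas[1])                 # c = beta alpha_2, (4.92)

def product_sign(a, b):
    """+1 if ab = ba, -1 if ab = -ba, 0 otherwise."""
    ab, ba = matmul(a, b), matmul(b, a)
    for sign in (1, -1):
        if all(abs(ab[i][j] - sign * ba[i][j]) < 1e-12
               for i in range(4) for j in range(4)):
            return sign
    return 0

# Order: alpha_1, alpha_2, alpha_3, beta.
signs = [product_sign(c_matrix, m) for m in alphas + [beta]]

def small_components(p, m, spinor):
    """(sigma . p / (E + m)) acting on a 2-spinor."""
    energy = math.sqrt(m * m + sum(x * x for x in p))
    s_dot_p = [[sum(p[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
               for i in range(2)]
    return [x / (energy + m) for x in matvec(s_dot_p, spinor)]

def norm(p, m):
    """The factor (E + m)^(1/2)."""
    return math.sqrt(math.sqrt(m * m + sum(x * x for x in p)) + m)

def u_spinor(p, m, phi):
    return [norm(p, m) * x for x in list(phi) + small_components(p, m, phi)]

def v_spinor(p, m, chi):
    return [norm(p, m) * x for x in small_components(p, m, chi) + list(chi)]

p, m = [0.3, -1.2, 0.7], 0.5                       # arbitrary test values
phi = [0.6 + 0j, 0.8j]                             # normalized 2-spinor
chi = [-1j * x for x in matvec(PAULI[1], [a.conjugate() for a in phi])]

u = u_spinor(p, m, phi)
u_c = [1j * x for x in matvec(c_matrix, [complex(a).conjugate() for a in u])]
mismatch = max(abs(a - b) for a, b in zip(u_c, v_spinor(p, m, chi)))
print(signs, mismatch)
```

The list `signs` records whether $c$ commutes (+1) or anticommutes (−1) with $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\beta$ in turn, and `mismatch` measures the difference between $i\gamma^2u^*$ and $v$ built from $\chi = -i\sigma_2\phi^*$.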

At this point we may wonder if it is possible to construct a self-conjugate 
4-spinor. Such a spinor would be appropriate for a fermionic particle which 
is the same as its antiparticle — that is, for a Majorana fermion, so named 
after Ettore Majorana who first raised this possibility (Majorana 1937). To 
pursue this idea, it is convenient to use the representation (3.40) for the Dirac 
matrices again, in order to keep track of the Lorentz transformation property 
of the Majorana spinor. Consider the 4-spinor 


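$$\psi_M = \begin{pmatrix} \phi \\ i\sigma_2\phi^* \end{pmatrix}. \tag{4.97}$$

Then

$$\psi_{M,C} = i\gamma^2\psi_M^* = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix} \begin{pmatrix} \phi^* \\ i\sigma_2\phi \end{pmatrix} = \begin{pmatrix} \phi \\ i\sigma_2\phi^* \end{pmatrix} = \psi_M, \tag{4.98}$$

so that indeed $\psi_M$ is self-conjugate. The Lorentz transformation property
of $\psi_M$ is consistent, since we may easily show (problem 4.4(c)) that the
2-spinor $\sigma_2\phi^*$ transforms as a $\chi$-type spinor. The reader can construct a similar
self-conjugate 4-spinor using $\chi$ rather than $\phi$.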
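The self-conjugacy of a spinor of the form $(\phi,\ i\sigma_2\phi^*)$ is easy to confirm numerically. The sketch below is in plain Python and rests on an assumption about the representation: it takes $\beta$ to have unit off-diagonal blocks and $\boldsymbol{\alpha} = \mathrm{diag}(\boldsymbol{\sigma}, -\boldsymbol{\sigma})$, so that $i\gamma^2$ has blocks $-i\sigma_2$ (upper right) and $i\sigma_2$ (lower left). The function names are hypothetical.

```python
# Self-conjugacy check for a Majorana-type 4-spinor psi_M = (phi, i*sigma_2*phi^*).
# Assumption: chiral-type representation, so that
#   C0 = i*gamma^2 = [[0, -i*sigma_2], [i*sigma_2, 0]].

I_SIGMA2 = [[0, 1], [-1, 0]]  # i*sigma_2, which is a real matrix


def apply2(m, v):
    """Apply a 2x2 matrix to a 2-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def majorana(phi):
    """Build psi_M = (phi, i*sigma_2*phi^*) from a 2-spinor phi."""
    phi_star = [complex(z).conjugate() for z in phi]
    return [complex(z) for z in phi] + apply2(I_SIGMA2, phi_star)


def charge_conjugate(psi):
    """Return psi_C = C0 psi^* for a 4-spinor psi."""
    star = [complex(z).conjugate() for z in psi]
    upper = [-z for z in apply2(I_SIGMA2, star[2:])]  # -i*sigma_2 on lower half
    lower = apply2(I_SIGMA2, star[:2])                # +i*sigma_2 on upper half
    return upper + lower


def same(a, b):
    return all(abs(x - y) < 1e-12 for x, y in zip(a, b))


psi_m = majorana([0.3 + 0.4j, -1.1 + 0.2j])
generic = [0.3 + 0.4j, -1.1 + 0.2j, 0.5 + 0j, 0.1j]

majorana_is_self_conjugate = same(charge_conjugate(psi_m), psi_m)
generic_is_self_conjugate = same(charge_conjugate(generic), generic)
print(majorana_is_self_conjugate, generic_is_self_conjugate)
```

The comparison with a generic 4-spinor illustrates that self-conjugacy is a genuine constraint: it ties the lower two components to the upper two.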

A self-conjugate fermion has to carry no distinguishing quantum number, 
such as electromagnetic charge. The only known neutral fermions are the neu- 
trinos, and until quite recently it was assumed that they are Dirac fermions, 
with distinct antiparticles (the relevant distinguishing quantum number being
lepton number). However, as we shall see in volume 2, owing to their very
small mass, it is hard to discriminate between the two possibilities (Majorana 
and Dirac) for neutrinos, and a definitive answer will have to await the result 
of a crucial experiment, the search for neutrinoless double beta decay, which 
is only possible for Majorana neutrinos. 

Returning to more conventional matters, we extend (as in the KG case) 
the transformation (4.93) to a formal symmetry of the Dirac equation by 
including the sign change of A“, so that C for the Dirac equation is 


$$C:\ \psi \to \psi_C = i\gamma^2\psi^*, \quad A^\mu \to -A^\mu. \tag{4.99}$$


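We now examine how the electromagnetic current behaves under C in the
Dirac case. The Dirac charge density is the probability density $\psi^\dagger\psi$ multiplied
by the charge $q$, and the electromagnetic 3-current is the probability current
$\psi^\dagger\boldsymbol{\alpha}\psi$ multiplied by $q$:

$$j^\mu_{\rm em} = q(\psi^\dagger\psi,\ \psi^\dagger\boldsymbol{\alpha}\psi) = q\bar{\psi}\gamma^\mu\psi. \tag{4.100}$$

Consider the charge density: under the transformation (4.93) this becomes

$$q\psi_C^\dagger\psi_C = q\psi^{\rm T}\gamma^{2\dagger}\gamma^2\psi^* = q\psi^{\rm T}\psi^*. \tag{4.101}$$

In terms of the four components of $\psi$, the product $\psi^{\rm T}\psi^*$ is $\psi_1\psi_1^* + \psi_2\psi_2^* + \psi_3\psi_3^* + \psi_4\psi_4^*$.
These components are ordinary functions which commute with
each other, so $\psi^{\rm T}\psi^* = \psi^{*{\rm T}}\psi = \psi^\dagger\psi$; hence

$$q\psi_C^\dagger\psi_C = q\psi^\dagger\psi \tag{4.102}$$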


and the charge density does not change sign under C. Similarly, one finds that 
the electromagnetic 3-current does not change sign either. 

These results can be interpreted in the hole theory picture: the current 
due to a physical positive energy antiparticle of charge q and momentum p is 
regarded as the same as that of a missing negative energy particle of charge 
—q and momentum p. Our charge conjugation operation explicitly constructs 
the positive energy antiparticle wavefunction from the negative energy particle 
one. 

Yet this is not really what we want a true charge conjugation operator to 
do: which is, rather, to change a positive energy particle into a positive energy 
antiparticle. The same inadequacy was true in the KG case also. There is 
no way of representing such an operation in a single particle wavefunction 
formalism. The appropriate formalism is quantum field theory, in which $\psi(x)$
becomes a quantum field operator (as do bosonic fields), and there is a unitary 
quantum field operator C with the required property. We shall see in chapter 
7 that fermionic operators anticommute with each other, and that this is just 
what is needed to ensure that the current changes sign under C. Bosonic 
fields, on the other hand, obey commutation rather than anticommutation 
relations, and this safeguards the change in sign of the bosonic current. 




We have approached charge conjugation following the historical route, 
which is to say via the electromagnetic interaction. But we can ask whether 
(true) C is a good symmetry of other interactions, for example the weak 
interaction. Consider applying C to the reaction (4.82), so that it becomes 


$$\pi^- \to \mu^- + \bar{\nu}_\mu. \tag{4.103}$$

If C were a good symmetry, the (parity-violating) longitudinal polarization
of the $\mu^-$ in (4.103) should be the same as that of the $\mu^+$ in (4.82). But
in fact it is the opposite, the $\mu^-$ spin being aligned along the direction of
its momentum. So C, like P, is violated in weak interactions. It is a good
symmetry in electromagnetic and strong interactions.


4.2.3 CP 


It has probably occurred to the reader that, although C and P are each 
violated in the decays (4.82) and (4.103), the combined transformation CP 
might be a good symmetry: particles are changed to antiparticles, the sense 
of longitudinal polarization is reversed, and the corresponding decays occur. 
Indeed, the rates for these two decays are the same, and CP is conserved. 
For a while, after 1956, it was hoped that CP would prove to be always 
conserved, so as to avoid a ‘lopsided’ distinction between right and left, and 
between matter and antimatter. But before long Christenson et al. (1964) 
reported evidence for CP violation in the decays of neutral K-mesons, a result 
soon confirmed by other experiments. 

As we mentioned in section 1.2.2, it was the difficulty of incorporating CP 
violation into the 2-generation electroweak theory that led Kobayashi and 
Maskawa (1973) to propose a third generation of quarks, which allowed a CP 
violating parameter to be included quite naturally. CP violation in K-decays 
is a small effect (of order one part in $10^3$), but Carter and Sanda (1980)
showed that considerably larger effects, up to 20%, could be expected in rare
decays of neutral B mesons, according to the framework of Kobayashi and
Maskawa (KM). Some 20 years later, the ‘B factories’ at the asymmetric $e^-e^+$
colliders PEP-II and KEKB began producing B mesons by the many millions,
and intensive study of CP violation in the $B^0(d\bar{b})$ – $\bar{B}^0(\bar{d}b)$ systems followed
at the BaBar and Belle detectors. Remarkably, all observations to date are
consistent with the original KM parametrization. We shall return to this 
topic when we discuss weak interactions in volume 2, specifically in chapter 
21. Meanwhile we refer to Bettini (2008), chapter 8, for an introductory 
overview. 

It is worth pausing here to note the significance of CP violation. First 
of all, it implies that there is an absolute distinction between matter and 
antimatter and, as a consequence, between left and right: these are not merely 
a matter of convention. For example, the rate for the process 


$$B^0 \to K^+\pi^- \tag{4.104}$$

is some 20% greater (Nakamura et al. 2010) than the rate for the CP-conjugate
process

$$\bar{B}^0 \to K^-\pi^+. \tag{4.105}$$

(Note that the $\bar{B}^0$ state is conventionally defined as the CP transform of the
$B^0$ state.) So the pion distinguished by being emitted in the higher-yielding
reaction (4.104) defines ‘negatively charged’, and the polarization of the muon
in its decay (4.103) defines what is a right-handed screw sense.

Secondly, CP (and C) violation is one of the three conditions² established
by Sakharov (1967) that would enable a universe containing initially equal 
amounts of matter and antimatter, when created in the Big Bang, to evolve 
into the matter-dominated universe we see today — rather than simply having 
the required imbalance as an initial condition. Within the Standard Model, 
all known CP violating effects are attributable to the KM mechanism. But 
calculations show (Huet and Sather 1995) that the matter-antimatter asym- 
metry generated from this source is very many orders of magnitude too small. 
This is, therefore, one area of physics where the Standard Model fails. 

Thirdly, CP violation is directly connected to the violation of another 
discrete symmetry, namely time reversal T, because very general principles of 
quantum field theory imply that the product CPT (in any order) is conserved 
(the CPT theorem). This theorem states (Lüders 1954, 1957, Pauli 1957) that
CPT must be an exact symmetry for any Lorentz invariant quantum field 
theory constructed out of local fields, with a Hermitian Hamiltonian, and 
quantized according to the usual spin-statistics rule (integer spin particles are 
bosons, half-odd integer spin particles are fermions). Thus any violation of 
CP implies a violation of T if CPT is to be conserved. 

We shall return to CPT presently, but first let us deal with T. 


4.2.4 Time reversal 
The time reversal transformation T is defined by

$$T:\ \mathbf{x} \to \mathbf{x}' = \mathbf{x}, \quad t \to t' = -t; \tag{4.106}$$

that is, T reverses the direction of time. It follows that T reverses momenta
($\mathbf{p} \to -\mathbf{p}$) and angular momenta ($\mathbf{x}\times\mathbf{p} \to -\mathbf{x}\times\mathbf{p}$). Let us also note how
the electromagnetic potentials transform under T: $A^0$ does not change, being
generated by static charges, while $\mathbf{A}$ changes sign, since it is produced by
currents; that is,

$$A^0_T(t') = A^0(t), \quad \mathbf{A}_T(t') = -\mathbf{A}(t). \tag{4.107}$$

It follows that the electric field $\mathbf{E}$ does not change sign under T, but the
magnetic field $\mathbf{B}$ does. It is easily checked that these prescriptions ensure
that the Maxwell equations are covariant under T.


² The other two are (a) the existence of baryon number violating transitions and (b) a
time when the C, CP and baryon number violating transitions proceeded out of thermal 
equilibrium.

Consider first the behaviour of the KG equation for a particle of charge $q$
in the field $A^\mu$:

$$(\Box + m^2)\phi(t) = -iq[\partial_\mu A^\mu(t) + A^\mu(t)\partial_\mu]\phi(t) + q^2A^2(t)\phi(t). \tag{4.108}$$

The equation in the time-reversed system is

$$(\Box' + m^2)\phi_T(t') = -iq[\partial'_\mu A^\mu_T(t') + A^\mu_T(t')\partial'_\mu]\phi_T(t') + q^2A_T^2(t')\phi_T(t'). \tag{4.109}$$

Using (4.107) we obtain

$$\partial'_\mu A^\mu_T(t') = -\partial_\mu A^\mu(t), \quad A^\mu_T(t')\partial'_\mu = -A^\mu(t)\partial_\mu, \quad A_T^2(t') = A^2(t). \tag{4.110}$$

With these substitutions (4.109) becomes the complex conjugate of (4.108), so
it follows that we can identify

$$\phi_T(t') = \phi^*(t) \tag{4.111}$$

up to an arbitrary phase factor, here chosen to be unity. If $\phi$ is a positive-
energy free particle solution, $\phi^*$ represents a particle of positive energy in the
time-reversed system, with momentum $-\mathbf{p}$ as expected.

Now consider the behaviour under T of the Dirac equation for a particle
of charge $q$ in a field $A^\mu$,

$$i\frac{\partial\psi(t)}{\partial t} = \{\boldsymbol{\alpha}\cdot[-i\boldsymbol{\nabla} - q\mathbf{A}(t)] + \beta m + qA^0(t)\}\psi(t) \tag{4.112}$$

where we have suppressed the spatial coordinate arguments. In the time-
reversed system, the corresponding equation is

$$i\frac{\partial\psi_T(t')}{\partial t'} = \{\boldsymbol{\alpha}\cdot[-i\boldsymbol{\nabla} - q\mathbf{A}_T(t')] + \beta m + qA^0_T(t')\}\psi_T(t'). \tag{4.113}$$

To relate $\psi_T$ to $\psi$ we start by taking the complex conjugate of (4.112) so as
to obtain

$$-i\frac{\partial\psi^*(t)}{\partial t} = \{\boldsymbol{\alpha}^*\cdot[i\boldsymbol{\nabla} - q\mathbf{A}(t)] + \beta^* m + qA^0(t)\}\psi^*(t) \tag{4.114}$$

which we may rewrite as

$$i\frac{\partial\psi^*(t)}{\partial t'} = \{\boldsymbol{\alpha}^*\cdot[i\boldsymbol{\nabla} + q\mathbf{A}_T(t')] + \beta^* m + qA^0_T(t')\}\psi^*(t). \tag{4.115}$$

Now suppose a unitary matrix $U_T$ exists such that

$$U_T\boldsymbol{\alpha}^*U_T^\dagger = -\boldsymbol{\alpha}, \quad U_T\beta^*U_T^\dagger = \beta; \tag{4.116}$$

then it is clear that the Dirac equation will be covariant under T with the
identification

$$\psi_T(t') = U_T\psi^*(t). \tag{4.117}$$

In either of the two representations of the Dirac matrices which we have been
using, $\alpha_1$, $\alpha_3$ and $\beta$ are real, while $\alpha_2$ is pure imaginary; it follows that $U_T$
must commute with $\alpha_2$ and $\beta$, and anticommute with $\alpha_1$ and $\alpha_3$. A suitable
$U_T$ is

$$U_T = i\alpha_1\alpha_3 \tag{4.118}$$

where the phase is a conventional choice.
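The conditions (4.116) on the choice $U_T = i\alpha_1\alpha_3$ can be verified directly. The following sketch uses plain Python and assumes the standard representation (off-diagonal $\alpha_i$, diagonal $\beta$); all helper names are illustrative.

```python
# Verify U_T alpha^* U_T^dagger = -alpha and U_T beta^* U_T^dagger = beta
# for U_T = i*alpha_1*alpha_3.
# Assumption: standard representation, alpha_i = [[0, s_i], [s_i, 0]].

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def conj(x):
    """Element-wise complex conjugate."""
    return [[complex(e).conjugate() for e in row] for row in x]


def dagger(x):
    """Hermitian conjugate (conjugate transpose)."""
    return [[complex(x[j][i]).conjugate() for j in range(4)] for i in range(4)]


def scaled(c, x):
    return [[c * e for e in row] for row in x]


def same(x, y):
    return all(abs(x[i][j] - y[i][j]) < 1e-12
               for i in range(4) for j in range(4))


ALPHAS = [block(Z2, s, s, Z2) for s in (S1, S2, S3)]
BETA = block(I2, Z2, Z2, MINUS_I2)
U_T = scaled(1j, mul(ALPHAS[0], ALPHAS[2]))

alpha_condition = all(
    same(mul(mul(U_T, conj(a)), dagger(U_T)), scaled(-1, a)) for a in ALPHAS)
beta_condition = same(mul(mul(U_T, conj(BETA)), dagger(U_T)), BETA)
is_block_sigma2 = same(U_T, block(S2, Z2, Z2, S2))
is_unitary = same(mul(U_T, dagger(U_T)), block(I2, Z2, Z2, I2))
print(alpha_condition, beta_condition, is_block_sigma2, is_unitary)
```

Under the stated assumption the last check also shows that $U_T$ is block diagonal with $\sigma_2$ in each block.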


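Let us check the effect of the transformation (4.117) on a positive-
energy plane wave solution (3.74). In the representation (3.31) $U_T$ is given
by

$$U_T = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \tag{4.119}$$

and the time-reversed wavefunction is

$$\psi_T(\mathbf{x}', t') = (E+m)^{1/2} \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} \phi^* \\ \dfrac{\boldsymbol{\sigma}^*\cdot\mathbf{p}}{E+m}\,\phi^* \end{pmatrix} \exp(iEt - i\mathbf{p}\cdot\mathbf{x})$$

$$= (E+m)^{1/2} \begin{pmatrix} \sigma_2\phi^* \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}'}{E+m}\,\sigma_2\phi^* \end{pmatrix} \exp(-iEt' + i\mathbf{p}'\cdot\mathbf{x}'), \tag{4.120}$$

which is a positive-energy solution with the expected momentum $\mathbf{p}' = -\mathbf{p}$,
and with the transformed spinor wavefunction $\sigma_2\phi^*$. If we take $\phi$ to be a
helicity eigenstate $\phi_\lambda$,

$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\,\phi_\lambda = \lambda\phi_\lambda \tag{4.121}$$

where $\lambda = \pm 1$, then it follows that

$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}'}{|\mathbf{p}'|}\,\sigma_2\phi_\lambda^* = \lambda\,\sigma_2\phi_\lambda^*, \tag{4.122}$$

and the helicity is unchanged.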
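The invariance of helicity under time reversal can be illustrated for a specific direction of $\mathbf{p}$. The sketch below, in plain Python, uses the standard explicit form of the $\lambda = +1$ helicity spinor for a momentum with polar angles $(\theta, \varphi)$; the angles chosen and the function names are arbitrary assumptions made for the example.

```python
# Check that sigma_2 phi^* has the same helicity with respect to p' = -p
# as phi has with respect to p.
import cmath
import math

SIGMA2 = [[0, -1j], [1j, 0]]


def sigma_dot(n):
    """Return the 2x2 matrix sigma.n for a unit 3-vector n."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]


def apply(m, v):
    """Apply a 2x2 matrix to a 2-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def helicity_plus_spinor(theta, varphi):
    """Spinor with sigma.n phi = +phi for n along the angles (theta, varphi)."""
    return [math.cos(theta / 2), cmath.exp(1j * varphi) * math.sin(theta / 2)]


theta, varphi = 0.7, 1.9  # arbitrary direction of p
n = (math.sin(theta) * math.cos(varphi),
     math.sin(theta) * math.sin(varphi),
     math.cos(theta))
n_reversed = tuple(-c for c in n)  # direction of p' = -p

phi = helicity_plus_spinor(theta, varphi)
phi_t = apply(SIGMA2, [z.conjugate() for z in phi])  # sigma_2 phi^*

original = apply(sigma_dot(n), phi)               # should equal +phi
reversed_ = apply(sigma_dot(n_reversed), phi_t)   # should equal +phi_t

helicity_before_ok = all(abs(a - b) < 1e-12 for a, b in zip(original, phi))
helicity_after_ok = all(abs(a - b) < 1e-12 for a, b in zip(reversed_, phi_t))
print(helicity_before_ok, helicity_after_ok)
```

Both the momentum and the spin are reversed by T, so their relative orientation, and hence $\lambda$, is preserved.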
As in the case of parity, we may introduce an operator $\hat{T}$ which changes
$\phi$ to $\phi_T$ for the KG equation, and $\psi$ to $\psi_T$ for the Dirac equation. Then

$$\hat{T}(\mathrm{KG}) = KT_0 \tag{4.123}$$

and

$$\hat{T}(\mathrm{Dirac}) = U_TKT_0 \tag{4.124}$$

where $K$ is the complex conjugation operator, and $T_0$ is the time coordinate
reversal operator. The appearance of $K$ is a general feature of time-reversal
in quantum mechanics (Wigner 1964), and has important consequences.³
Because the transformations involve complex conjugation, the scalar product of


³ Complex conjugation also appeared in our discussion of C in section 4.2.2, but as
indicated there the true operator C of quantum field theory is unitary. Even in quantum field
theory, however, the time-reversal operator involves complex conjugation, as we shall see in
section 7.5.3.

two wavefunctions $\langle\psi_2|\psi_1\rangle$ is not equal to the corresponding quantity
$\langle\psi_{2T}|\psi_{1T}\rangle$, as it would be in the case of parity, for example, or for any
other transformation represented by a unitary operator. Instead, we have

$$\langle\psi_2|\psi_1\rangle = \langle\psi_{2T}|\psi_{1T}\rangle^*. \tag{4.125}$$

Note, however, that the probability $|\langle\psi_2|\psi_1\rangle|^2$ is still preserved.
If we consider the matrix element of any operator $\hat{O}$, then since $\hat{O}\psi_1$ is
itself a wavefunction, we must have

$$\langle\psi_2|\hat{O}|\psi_1\rangle = \langle\psi_2|\hat{O}\psi_1\rangle = \langle\psi_{2T}|\hat{T}\hat{O}\psi_1\rangle^* = \langle\psi_{2T}|\hat{T}\hat{O}\hat{T}^{-1}\psi_{1T}\rangle^* \tag{4.126}$$

where $\hat{T}\hat{O}\hat{T}^{-1}$ is the operator in the time-reversed system. In particular, if
we take $\hat{O}$ to be a Hermitian interaction potential $V$, which is time-reversal
invariant, then time-reversal invariance implies the relation

$$\langle\psi_2|V|\psi_1\rangle = \langle\psi_{2T}|V\psi_{1T}\rangle^* = \langle\psi_{1T}|V|\psi_{2T}\rangle. \tag{4.127}$$

Now $\langle\psi_2|V|\psi_1\rangle$ is the amplitude for the state represented by $\psi_1$ to make a
transition to the state represented by $\psi_2$ to first order in the potential $V$ (see
section M.3 of appendix M). Equation (4.127) therefore relates this amplitude 
to one for the inverse transition, involving time-reversed states. The relation in 
fact holds for the complete (all orders) transition operator T (see for example 
Lee 1981, section 13.5), and enables one to relate rates and cross sections for 
reactions and their inverses. 

For strong interactions, these relations are straightforward to test, and 
confirm that strong interactions are T-invariant. So are electromagnetic inter- 
actions. In weak interactions, where the violation of CP and the conservation 
of CPT implies that T is violated, it is generally very difficult if not impos- 
sible to set up the conditions for an inverse reaction to occur (consider the 
inverse of neutron decay, $n \to p\,e^-\bar{\nu}_e$, for example). However, one such test is
possible in neutral K-decays (Kabir 1970). We can check whether the rate for
a particle tagged at its production as a $K^0$ to decay in a way that identifies
it as a $\bar{K}^0$ is equal to the rate for a particle tagged as $\bar{K}^0$ at its production
to decay in a way that identifies it as a $K^0$. The experiment (Angelopoulos
et al. 1998) showed a T-violating difference in these rates. The parame- 
ters determining these reactions had actually been well determined by other 
measurements; still, this was an independent and direct demonstration of T 
violation. Evidence for T violation in B-meson transitions has been reported 
by Alvarez and Szynkman (2008), developing a test suggested by Banuls and 
Bernabeu (1999, 2000). 

We can also examine the behaviour of various bilinears under T. For ex- 
ample, the reader may easily check the results 


$$\bar{\psi}_T(x')\psi_T(x') = \bar{\psi}(x)\psi(x), \quad \bar{\psi}_T(x')\gamma_5\psi_T(x') = -\bar{\psi}(x)\gamma_5\psi(x). \tag{4.128}$$


Time reversal symmetry will be violated if the theory contains both even and
odd amplitudes under T. An interesting example is provided by the amplitude

$$-\tfrac{i}{2}\,d_e\,\bar{\psi}(x)\sigma^{\mu\nu}\gamma_5\psi(x)F_{\mu\nu}, \tag{4.129}$$

where

$$\sigma^{\mu\nu} = \tfrac{i}{2}(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu) \tag{4.130}$$

and where $F_{\mu\nu}$ is an external electric field with non-vanishing components
$F_{0i} = E^i$. In the representation (3.31)

$$\sigma^{0i}\gamma_5 = i\begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} \equiv i\Sigma^i \tag{4.131}$$

and (4.129) reduces to

$$d_e\,\bar{\psi}(x)\boldsymbol{\Sigma}\psi(x)\cdot\mathbf{E}. \tag{4.132}$$


Problem 4.5 shows that the quantity (4.132) is odd under T, and it is easy 
to check that it is also odd under P. A non-zero value of such a term would 
correspond to an electric dipole moment for a spin-1/2 particle (compare the 
analogous quantity $d_m\bar{\psi}(x)\boldsymbol{\Sigma}\psi(x)\cdot\mathbf{B}$ for the magnetic dipole moment, which
is even under P and T). Experiment places very strong limits on possible 
electric dipole moments (Nakamura et al. 2010) for the neutron, proton and 
electron: 


$$d_n < 0.29 \times 10^{-25}\ e\ \mathrm{cm} \tag{4.133}$$
$$d_p < 0.54 \times 10^{-23}\ e\ \mathrm{cm} \tag{4.134}$$
$$d_e = (0.069 \pm 0.074) \times 10^{-26}\ e\ \mathrm{cm} \tag{4.135}$$

Although these numbers seem tiny, calculations of $d_n$ in the Standard
Model produce a result some 6 or 7 orders of magnitude smaller than (4.133).
However, these experimental limits impose strong constraints on theories 
which go beyond the Standard Model, and which may typically contain the 
possibility of larger T and CP violating effects. 


4.2.5 CPT 


We denote the product CPT by $\theta$, and the corresponding operator by $\hat{\theta}$. As
already mentioned, for any conventional quantum field theory, and certainly
for the Standard Model, the transformation $\theta$ is an invariance of the theory.
One immediate consequence of this invariance is the equality of particle and
antiparticle masses. This is easily demonstrated. Let $|X, s_z\rangle$ be the state of
a particle X at rest with z-component of spin equal to $s_z$. The mass of X is
given by the expectation value

$$M_X = \langle X, s_z|H|X, s_z\rangle, \tag{4.136}$$

where $H$ is the total Hamiltonian. Clearly $M_X$ is real, and independent of
$s_z$. Now the operator $\hat{\theta}$ involves T, and therefore we must be careful to use
(4.126) rather than the usual rule for unitary operators. So from (4.126) we
have

$$M_X = \langle X, s_z|H|X, s_z\rangle^* = \langle X, s_z|\hat{\theta}^{-1}\,\hat{\theta}H\hat{\theta}^{-1}\,\hat{\theta}|X, s_z\rangle. \tag{4.137}$$

If the Hamiltonian is CPT invariant, then $\hat{\theta}H\hat{\theta}^{-1} = H$. Also, we know
the action of P, C and T on the states, from the previous results. Equation
(4.137) then becomes

$$M_X = \langle \bar{X}, -s_z|H|\bar{X}, -s_z\rangle = M_{\bar{X}}, \tag{4.138}$$

stating the equality of particle and antiparticle masses. The most sensitive
test of (4.138) is provided by the $K^0$ – $\bar{K}^0$ system, where the currently quoted
limit for the mass difference is (Nakamura et al. 2010)

$$\frac{|M_{K^0} - M_{\bar{K}^0}|}{M_{K^0}} < 8 \times 10^{-19} \quad \text{at 90\% CL.} \tag{4.139}$$


$\theta$-invariance also implies that the charges of a charged particle and its
antiparticle are equal in magnitude but opposite in sign, as are their magnetic
moments; and in the case of unstable particles it implies that their lifetimes
are equal, to first order in the interaction responsible for the decay (Lee 1981).
All current data support these equalities (Nakamura et al. 2010). Other tests
involve analysis of the implications of $\theta$-invariance as applied to transition
amplitudes. As an example, we refer to a recent analysis of K-decays by
Abouzaid et al. (2011), both with and without the assumption of $\theta$-invariance.
The results were consistent with $\theta$-invariance.


Problems 
4.1 Consider an infinitesimal boost along the x-axis,

$$t' = t - \eta x \tag{4.140}$$
$$x' = x - \eta t. \tag{4.141}$$

Show that the KG wavefunction transforms according to

$$\phi'(x, t) = (1 + i\eta\hat{K}_x)\phi(x, t), \tag{4.142}$$

where

$$\hat{K}_x = -i\,x\,\partial/\partial t - i\,t\,\partial/\partial x. \tag{4.143}$$

Defining similar operators $\hat{K}_y$, $\hat{K}_z$ for boosts in the y and z directions, show
that

$$[\hat{K}_x, \hat{K}_y] = -i\hat{L}_z. \tag{4.144}$$

4.2 In this problem, use the representation (3.40) for the Dirac matrices, as
in section 4.1.2.

(a) Using the rule (4.19) for the transformation of the spinor $\phi$ under
an infinitesimal rotation of the coordinate system, verify that $\phi^\dagger\boldsymbol{\sigma}\phi$
transforms as a 3-vector. [Hint: you need to show that $\phi'^\dagger\boldsymbol{\sigma}\phi' =
\phi^\dagger\boldsymbol{\sigma}\phi - \boldsymbol{\epsilon}\times\phi^\dagger\boldsymbol{\sigma}\phi$; use the results of problem 3.4(a).] Show also that
the free-particle Dirac probability current density is a 3-vector.

(b) Using the rule (4.42) for the transformation of $\phi$ and $\chi$ under an
infinitesimal boost, verify that $\mathbf{j} = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi$ transforms as the
3-vector part of the 4-vector $(\rho, \mathbf{j})$. [Hint: you need to show that
$\mathbf{j}' = \mathbf{j} - \boldsymbol{\eta}\rho$.]


4.3

(a) Defining the four ‘$\gamma$ matrices’

$$\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$$

where $\gamma^0 = \beta$ and $\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, show that the Dirac equation can
be written in the form $(i\gamma^\mu\partial_\mu - m)\psi = 0$. Find the anticommutation
relations of the $\gamma$ matrices. Show that the positive energy
spinors $u(p,s)$ satisfy $(\not{p} - m)u(p,s) = 0$, and that the negative
energy spinors $v(p,s)$ satisfy $(\not{p} + m)v(p,s) = 0$, where $\not{p} = \gamma^\mu p_\mu$
(pronounced ‘p-slash’).

(b) Define the conjugate spinor

$$\bar{\psi}(x) = \psi^\dagger(x)\gamma^0$$

and use the previous result to find the equation satisfied by $\bar{\psi}$ in $\gamma$
matrix notation.

(c) The Dirac probability current may be written as

$$j^\mu = \bar{\psi}(x)\gamma^\mu\psi(x).$$

Show that it satisfies the conservation law

$$\partial_\mu j^\mu = 0.$$


4.4

(a) Verify that, under P, $\bar{\psi}(\mathbf{x}, t)\gamma^0\psi(\mathbf{x}, t)$ is a scalar, and that $\bar{\psi}(\mathbf{x}, t)\boldsymbol{\gamma}\psi(\mathbf{x}, t)$
is a polar vector.

(b) Verify that $a^\mu(\mathbf{x}, t) = \bar{\psi}(\mathbf{x}, t)\gamma_5\gamma^\mu\psi(\mathbf{x}, t)$ transforms under infinitesimal
rotations and boosts as a 4-vector; and that under P $a^0(\mathbf{x}, t)$ is
a pseudoscalar, and $\mathbf{a}(\mathbf{x}, t)$ is an axial vector.

(c) Show that $\sigma_2\phi^*$ transforms under rotations and boosts as a $\chi$-type
spinor, and that $\sigma_2\chi^*$ transforms as a $\phi$-type spinor.

4.5 Verify that $\bar{\psi}(\mathbf{x}, t)\boldsymbol{\Sigma}\psi(\mathbf{x}, t)\cdot\mathbf{E}$ of (4.132) is odd under T.

4.6 The Galilean transformation (non-relativistic boost) is defined by

$$\mathbf{x}' = \mathbf{x} - \mathbf{v}t, \quad t' = t.$$

Show that the free-particle time-dependent Schrödinger equation is covariant
under this transformation if the wavefunction transforms according to the rule
$\psi'(\mathbf{x}', t') = \exp[if(\mathbf{x}, t)]\psi(\mathbf{x}, t)$, where $f(\mathbf{x}, t)$ satisfies the condition

$$-\frac{\partial f}{\partial t} - \mathbf{v}\cdot\boldsymbol{\nabla}f + i\mathbf{v}\cdot\boldsymbol{\nabla} = \frac{1}{2m}(\boldsymbol{\nabla}f)^2 - \frac{i}{2m}\nabla^2 f - \frac{i}{m}\boldsymbol{\nabla}f\cdot\boldsymbol{\nabla}.$$

Find constants $a$ and $\mathbf{b}$ such that the function $f = at + \mathbf{b}\cdot\mathbf{x}$ satisfies this
condition. Show that the resulting transformation rule is consistent with the
way you expect a plane wave solution to transform.
Introduction to Quantum

It was a wonderful world my father told me about.

You might wonder what he got out of it all. I went to MIT. I went to 
Princeton. I went home and he said, ‘Now you’ve got a science education. I 
have always wanted to know something that I have never understood; and so, 
my son, I want you to explain it to me.’ I said yes. 

He said, ‘I understand that they say that light is emitted from an atom 
when it goes from one state to another, from an excited state to a state of 
lower energy.’ 

I said ‘That’s right.’ 

‘And light is a kind of particle, a photon I think they call it.’ 

‘Yes.’ 

‘So if the photon comes out of the atom when it goes from the excited to 
the lower state, the photon must have been in the atom in the excited state.’ 

I said, ‘Well, no.’ 

He said, ‘Well, how do you look at it so you can think of a particle photon 
coming out without it having been in there in the excited state?’ 

I thought a few minutes, and I said, ‘I’m sorry; I don’t know. I can’t
explain it to you.’ 

He was very disappointed after all these years and years trying to teach 
me something, that it came out with such poor results. 


—R. P. Feynman, The Physics Teacher, vol 7, No 6, September 1969 


All the fifty years of conscious brooding have brought me no closer to the 
answer to the question, ‘What are light quanta?’ Of course today every rascal 
thinks he knows the answer, but he is deluding himself. 


—A. Einstein (1951) 


Quoted in ‘Einstein’s research on the nature of light’ 
E. Wolf (1979), Optic News, vol 5, No 1, page 39. 


I never satisfy myself until I can make a mechanical model of a thing. If I can 
make a mechanical model I can understand it. As long as I cannot make a 
mechanical model all the way through I cannot understand; and that is why 
I cannot get the electromagnetic theory. 


—Sir William Thomson, Lord Kelvin, 1884 Notes of Lectures on Molecular 
Dynamics and the Wave Theory of Light delivered at the Johns Hopkins Uni- 
versity, Baltimore, stenographic report by A. S. Hathaway (Baltimore: Johns 
Hopkins University) Lecture XX, pp 270-1.

Quantum Field Theory 1: The Free Scalar
Field 


In this chapter we shall give an elementary introduction to quantum field 
theory, which is the established ‘language’ of the Standard Model of particle 
physics. Even so long after Maxwell’s theory of the (classical) electromagnetic 
field, the concept of a ‘disembodied’ field is not an easy one; and we are 
going to have to add the complications of quantum mechanics to it. In such a 
situation, it is helpful to have some physical model in mind. For most of us, as 
for Lord Kelvin, this still means a mechanical model. Thus in the following two 
sections we begin by considering a mechanical model for a quantum field. At 
the end, we shall — like Maxwell — throw away the ‘mechanism’ and have simply 
quantum field theory. Section 5.1 describes this programme qualitatively; 
section 5.2 presents a more complete formalism, for the simple case of a field 
whose quanta are massless, and move in only one spatial dimension. The 
appropriate generalizations for massive quanta in three dimensions are given 
in section 5.3. 




5.1 The quantum field: (i) descriptive 


Mechanical systems are usefully characterized by the number of degrees of 
freedom they possess: thus a one-dimensional pendulum has one degree of 
freedom, two coupled one-dimensional pendulums have two degrees of free- 
dom — which may be taken to be their angular displacements, for example. A 
scalar field $\phi(\mathbf{x},t)$ corresponds to a system with an infinite number of degrees
of freedom, since at each continuously varying point $\mathbf{x}$ an independent
‘displacement’ $\phi(\mathbf{x},t)$, which also varies with time, has to be determined. Thus
quantum field theory involves two major mathematical steps: the description 
of continuous systems (fields) which have infinitely many degrees of freedom, 
and the application of quantum theory to such systems. These two aspects are 
clearly separable. It is certainly easier to begin by considering systems with 
a discrete — but possibly very large — number of degrees of freedom, for ex- 
ample a solid. We shall treat such systems first classically and then quantum 
mechanically. Then, returning to the classical case, we shall allow the number


FIGURE 5.1 
A vibrating system with two degrees of freedom: (a) two mass points at rest, 
with the strings under tension; (b) a small transverse displacement. 


of degrees of freedom to become infinite, so that the system corresponds to a 
classical field. Finally, we shall apply quantum mechanics directly to fields. 
We begin by considering a rather small solid, one that has only two atoms
free to move. The atoms, each of mass $m$, are connected by a string, and each
is connected to a fixed support by a similar string (figure 5.1(a)); all the
strings are under tension $F$. We consider small transverse vibrations of the
atoms (figure 5.1(b)), and we call $q_r(t)$ ($r = 1, 2$) the transverse displacements.
We are interested in the total energy $E$ of the system. According to classical
mechanics, this is equal to the sum of the kinetic energies $\frac{1}{2}m\dot{q}_r^2$ of each
atom, together with a potential energy $V$ which can be calculated as follows.
Referring to figure 5.1(b), when atom 1 is displaced by $q_1$, it experiences a
restoring force

$$F_1 = F\sin\alpha - F\sin\beta \tag{5.1}$$


assuming a constant tension $F$ along the string. For small displacements $q_1$
and $q_2$ (i.e. $q_{1,2} \ll l$) we have

$$\sin\alpha = q_1/(l^2 + q_1^2)^{1/2} \approx q_1/l$$
$$\sin\beta = (q_2 - q_1)/[l^2 + (q_2 - q_1)^2]^{1/2} \approx (q_2 - q_1)/l \tag{5.2}$$

where terms of order $(q_{1,2}/l)^3$ and higher have been neglected. Thus the
restoring force on particle 1 is, in this approximation,

$$F_1 = k(2q_1 - q_2) \tag{5.3}$$

with $k = F/l$. Similarly, the restoring force on particle 2 is

$$F_2 = k(2q_2 - q_1) \tag{5.4}$$

and the equations of motion are

$$m\ddot{q}_1 = -k(2q_1 - q_2) \tag{5.5}$$
$$m\ddot{q}_2 = -k(2q_2 - q_1). \tag{5.6}$$

The potential energy is then determined (up to an irrelevant constant) by the
requirement that (5.5) and (5.6) are of the form

$$m\ddot{q}_1 = -\partial V/\partial q_1 \tag{5.7}$$
$$m\ddot{q}_2 = -\partial V/\partial q_2. \tag{5.8}$$

Thus we deduce that

$$V = k(q_1^2 + q_2^2 - q_1q_2). \tag{5.9}$$


Equations (5.5) and (5.6) form a pair of linear, coupled differential equations.
Each of the words ‘linear’ and ‘coupled’ is important. By ‘linear’, is meant that only
the first power of $q_1$ and $q_2$ and their time derivatives appear in the equations
of motion; terms such as $q_1^2$, $q_1q_2$, $\dot{q}_1^2$, $q_1^3$ and so on would render the equations
of motion ‘nonlinear’. This linear/nonlinear distinction is a crucial one
in dynamics. Most importantly, the solutions of linear differential equations 
may be added together with constant coefficients (‘linearly superposed’) to 
make new valid solutions of the equations. In contrast, solutions of nonlinear 
differential equations — besides being very hard to find! — cannot be linearly 
superposed to get new solutions. In addition, nonlinear dynamical equations 
may typically lead to chaotic motion. 

The notion of linearity/nonlinearity carries over also into the equations of
motion for fields. In this context, an equation for a field $\phi(\mathbf{x},t)$ is said to be
linear if $\phi$ and its space or time derivatives appear only to the first power.
As we shall see, this is true for Maxwell’s equations for the electromagnetic
field and it is, of course, the mathematical reason behind all the physics of such
things as interference and diffraction, which may be understood precisely in
terms of superposition of solutions of these equations. Likewise the equations
of quantum mechanics (e.g. Schrödinger’s equation) are all linear in this sense,
consistent with the principle of superposition in quantum mechanics.

It is clear, then, that in looking at simple mechanical models as a guide 
to the field systems in which we will ultimately be interested, we should con- 
sider ones in which the equations of motion are linear. In the present case, 
this is true, but only because we have made the approximation that $q_1$ and
$q_2$ are small (compared to $l$). Referring to equation (5.2), we can immediately
see that if we had kept the full expression for $\sin\alpha$ and $\sin\beta$, the
resulting equations of motion would have been highly nonlinear. A similar 
‘small displacement’ approximation has to be made in determining the famil- 
iar wave equation, describing waves on continuous strings, for example (see 
(5.29) later). Most significantly, however, quantum mechanics is believed to 
be a linear theory without any approximation. 

The appearance of only linear terms in $q_1$ and $q_2$ in the equations of motion
implies, via (5.7) and (5.8), that the potential energy can only involve
quadratic powers of the $q$’s, i.e. $q_1^2$, $q_2^2$ and $q_1q_2$, as in (5.9). Once again, had
we used the general expression for the potential energy in a stretched string
as ‘tension × extension’ we would have obtained an expression containing all
powers of the $q$’s via such terms as $\{[l^2 + q_1^2]^{1/2} - l\}$.

We turn now to the coupled aspect of (5.5) and (5.6). By this we mean
that the right-hand side of the $q_1$ equation depends on $q_2$ as well as $q_1$, and
similarly for the $q_2$ equation. This ‘mathematical’ coupling has its origin in
the term $-kq_1q_2$ in $V$, which corresponds to the ‘physical’ coupling of the
string BC connecting the two atoms. If this coupling were absent, equations
(5.5) and (5.6) would describe two independent (uncoupled) harmonic
oscillators, each of frequency $(2k/m)^{1/2}$. When we consider the addition of
more and more particles (see later) we certainly do not want them to vibrate 
independently, otherwise we would not be able to get wave-like displacements 
propagating through the system. So we need to retain at least this minimal 
kind of ‘quadratic’ coupling. 

With the coupling, the solutions of (5.5) and (5.6) are not quite so obvious. 
However, a simple step makes the equations much easier. Suppose we add the 
two equations so as to obtain 


$$m(\ddot{q}_1 + \ddot{q}_2) = -k(q_1 + q_2) \tag{5.10}$$

and subtract them to obtain

$$m(\ddot{q}_1 - \ddot{q}_2) = -3k(q_1 - q_2). \tag{5.11}$$

A remarkable thing has happened: the two combinations $q_1 + q_2$ and $q_1 - q_2$
of the original coordinates satisfy uncoupled equations, which are of course
very easy to solve. The combination $q_1 + q_2$ oscillates with frequency $\omega_1 =
(k/m)^{1/2}$, while $q_1 - q_2$ oscillates with frequency $\omega_2 = (3k/m)^{1/2}$.
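The same frequencies follow from diagonalizing the $2\times 2$ matrix that appears on the right-hand side of (5.5) and (5.6). The following is a minimal sketch in plain Python; the function name `normal_modes` and the numerical values of $k$ and $m$ are chosen purely for illustration.

```python
# Normal-mode frequencies of m*q'' = -K q with K = k*[[2, -1], [-1, 2]],
# obtained from the eigenvalues of the symmetric 2x2 matrix K/m.
import math


def normal_modes(k, m):
    """Return [(omega, (c1, c2)), ...], lowest frequency first.

    (c1, c2) is the normalised eigenvector giving the relative
    displacements of the two atoms in that mode.
    """
    a, b = 2.0 * k / m, -1.0 * k / m      # K/m = [[a, b], [b, a]]
    trace, det = 2.0 * a, a * a - b * b
    disc = math.sqrt(trace * trace - 4.0 * det)
    modes = []
    for lam in ((trace - disc) / 2.0, (trace + disc) / 2.0):
        vec = (b, lam - a)                 # solves (K/m - lam) vec = 0
        norm = math.hypot(vec[0], vec[1])
        modes.append((math.sqrt(lam), (vec[0] / norm, vec[1] / norm)))
    return modes


k, m = 1.0, 1.0
(omega1, mode1), (omega2, mode2) = normal_modes(k, m)
print(omega1, mode1)
print(omega2, mode2)
```

The eigenvectors come out proportional to $(1, 1)$ and $(1, -1)$ up to an overall sign, which are the combinations $q_1 + q_2$ and $q_1 - q_2$ found above by adding and subtracting the equations.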

Let us introduce 


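$$Q_1 = (q_1 + q_2)/\sqrt{2} \qquad Q_2 = (q_1 - q_2)/\sqrt{2} \tag{5.12}$$

(the $\sqrt{2}$’s are for later convenience). Then the solutions of (5.10) and (5.11)
are:

$$Q_1(t) = A\cos\omega_1 t + B\sin\omega_1 t \tag{5.13}$$
$$Q_2(t) = C\cos\omega_2 t + D\sin\omega_2 t. \tag{5.14}$$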


Suppose that the initial conditions are such that 
$$ q_1(0) = q_2(0) = a \qquad \dot{q}_1(0) = \dot{q}_2(0) = 0 \tag{5.15} $$

i.e. the atoms are released from rest, at equal transverse displacements a. In
terms of the $Q_r$’s, the conditions (5.15) are

$$ Q_2(0) = \dot{Q}_2(0) = 0 \qquad Q_1(0) = \sqrt{2}\,a \qquad \dot{Q}_1(0) = 0. \tag{5.16} $$

Thus from (5.13) and (5.14) we find that the complete solution, for these
initial conditions, is

$$ Q_1(t) = \sqrt{2}\,a\cos\omega_1 t \tag{5.17} $$
$$ Q_2(t) = 0. \tag{5.18} $$


FIGURE 5.2
Motion in the two normal modes: (a) frequency $\omega_1$; (b) frequency $\omega_2$.


We see from (5.18) that the motion is such that $q_1 = q_2$ throughout, and from
(5.17) that the system vibrates with a single definite frequency $\omega_1$. A form
of motion in which the system as a whole moves with a definite frequency
is called a ‘normal mode’ or simply a ‘mode’ for short. Figure 5.2(a) shows
two ‘snapshot’ configurations of our two-atom system when it is oscillating in
the mode characterized by $q_1 = q_2$. In this mode, only $Q_1(t)$ changes; $Q_2(t)$
is always zero. Another mode also exists in which $q_1 = -q_2$ at all times:
here $Q_1(t)$ is zero and $Q_2(t)$ oscillates with frequency $\omega_2$. Figure 5.2(b) shows
two snapshots of the atoms when they are vibrating in this second mode.
The coordinate combinations $Q_1$, $Q_2$, in terms of which this ‘single frequency
motion’ occurs, are called ‘normal mode coordinates’ or ‘normal coordinates’
for short.
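
The two pure modes can also be seen by integrating the coupled equations of motion directly. The sketch below is an illustration under stated assumptions: it takes (5.5) and (5.6) to be $m\ddot{q}_1 = -2kq_1 + kq_2$ and $m\ddot{q}_2 = -2kq_2 + kq_1$, uses a velocity-Verlet integrator (a choice made here, not one taken from the text), and sets k = m = 1.

```python
import math

def evolve(q1, q2, v1, v2, k=1.0, m=1.0, dt=1e-3, steps=5000):
    """Velocity-Verlet integration of the coupled equations of motion
    m*q1'' = -2k*q1 + k*q2 and m*q2'' = -2k*q2 + k*q1 (assumed form)."""
    def accelerations(x1, x2):
        return (-2 * k * x1 + k * x2) / m, (-2 * k * x2 + k * x1) / m

    a1, a2 = accelerations(q1, q2)
    for _ in range(steps):
        q1 += v1 * dt + 0.5 * a1 * dt * dt
        q2 += v2 * dt + 0.5 * a2 * dt * dt
        b1, b2 = accelerations(q1, q2)
        v1 += 0.5 * (a1 + b1) * dt
        v2 += 0.5 * (a2 + b2) * dt
        a1, a2 = b1, b2
    return q1, q2

a, t_final = 0.1, 5.0  # steps * dt = 5.0
# Mode (a): equal displacements, released from rest, as in (5.15).
q1, q2 = evolve(a, a, 0.0, 0.0)
print(q1 - q2, q1 - a * math.cos(1.0 * t_final))
# Mode (b): opposite displacements, released from rest.
q1, q2 = evolve(a, -a, 0.0, 0.0)
print(q1 + q2, q1 - a * math.cos(math.sqrt(3.0) * t_final))
```

Both printed differences should be close to zero: in each case the atoms keep their initial relative pattern and oscillate at a single frequency.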

In general, the initial conditions will not be such that the motion is a pure 
mode; both $Q_1(t)$ and $Q_2(t)$ will be non-zero. From (5.12) we have

$$ q_1(t) = [Q_1(t) + Q_2(t)]/\sqrt{2} \tag{5.19} $$

and

$$ q_2(t) = [Q_1(t) - Q_2(t)]/\sqrt{2} \tag{5.20} $$

so that $q_1$ and $q_2$ are expressed as a sum of two terms oscillating with
frequencies $\omega_1$ and $\omega_2$. We say the system is in ‘a superposition of modes’.
Nevertheless, the mode idea is still very important as regards the total energy of
the system, as we shall now see. The kinetic energy can be written in terms
of the mode coordinates $Q_r$ as

$$ T = \tfrac{1}{2} m\dot{Q}_1^2 + \tfrac{1}{2} m\dot{Q}_2^2 \tag{5.21} $$

while the potential energy V of (5.9) becomes

$$ V = \tfrac{1}{2} m\omega_1^2 Q_1^2 + \tfrac{1}{2} m\omega_2^2 Q_2^2 = V(Q_1, Q_2). \tag{5.22} $$


The total energy is therefore 


$$ E = [\tfrac{1}{2} m\dot{Q}_1^2 + \tfrac{1}{2} m\omega_1^2 Q_1^2] + [\tfrac{1}{2} m\dot{Q}_2^2 + \tfrac{1}{2} m\omega_2^2 Q_2^2]. \tag{5.23} $$

This equation shows that, when written in terms of the normal coordinates,
the total energy contains no coupling terms of the form $Q_1 Q_2$; indeed, the
energy has the remarkable form of a simple sum of two independent uncoupled
oscillators, one with characteristic frequency $\omega_1$, the other with frequency $\omega_2$.
The energy (5.23) has exactly the form appropriate to a system of two
non-interacting ‘things’, each executing simple harmonic motion: the ‘things’ are
actually the two modes. Modes do not interact, whereas the original atoms do!
Of course, this decoupling in the expression for the total energy is reflected in
the decoupling of the equations of motion for the Q variables:

$$ m\ddot{Q}_r = -\frac{\partial V(Q_1, Q_2)}{\partial Q_r} \qquad r = 1, 2. \tag{5.24} $$

It is most important to realize that the modes are non-interacting by virtue 
of the fact that we ignored higher than quadratic terms in V (q1, q2). Although 
the simple change of variables $(q_1, q_2) \to (Q_1, Q_2)$ of (5.12) does remove the
$q_1 q_2$ coupling, this would not be the case if, say, cubic terms in V were to
be considered. Such higher order ‘anharmonic’ corrections would produce
couplings between the modes; indeed, this will be the basis of the quantum
field theory description of particle interactions (see the following chapter)!
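
The statement that a quadratic V leaves the modes uncoupled, while a cubic term couples them, can be checked by estimating the mixed derivative $\partial^2 V/\partial Q_1 \partial Q_2$. In the sketch below the cubic term `cubic * q1**3`, the value k = 1 and the function names are illustrative assumptions; V is taken in the form $k(q_1^2 + q_2^2 - q_1 q_2)$, consistent with the $-kq_1q_2$ term and the frequencies quoted above.

```python
import math

K = 1.0  # illustrative spring constant

def potential(q1, q2, cubic=0.0):
    """Quadratic V (assumed form of (5.9)) plus an optional term cubic*q1**3."""
    return K * (q1 * q1 + q2 * q2 - q1 * q2) + cubic * q1 ** 3

def mode_coupling(Q1, Q2, cubic=0.0, h=1e-3):
    """Central-difference estimate of d^2 V / dQ1 dQ2 at (Q1, Q2)."""
    def v(X1, X2):
        # Invert (5.12): q1 = (Q1 + Q2)/sqrt(2), q2 = (Q1 - Q2)/sqrt(2)
        return potential((X1 + X2) / math.sqrt(2), (X1 - X2) / math.sqrt(2), cubic)
    return (v(Q1 + h, Q2 + h) - v(Q1 + h, Q2 - h)
            - v(Q1 - h, Q2 + h) + v(Q1 - h, Q2 - h)) / (4 * h * h)

print(mode_coupling(0.3, -0.2))             # quadratic V: no Q1*Q2 term
print(mode_coupling(0.3, -0.2, cubic=0.5))  # cubic term couples the modes
```

The first number vanishes up to rounding, while the second does not: with the anharmonic term the force on one mode depends on the other mode's coordinate.
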
The system under discussion had just two degrees of freedom. We began 
by describing it in terms of the obvious degrees of freedom, the physical
displacements of the two atoms $q_1$ and $q_2$. But we have learned that it is very
illuminating to describe it in terms of the normal coordinate combinations
$Q_1$ and $Q_2$. The normal coordinates are really the relevant degrees of freedom.
Of course, for just two particles, the choice between the $q_r$’s and the
$Q_r$’s may seem rather academic; but the important point, and the reason
for going through these simple manipulations in detail, is that the basic idea
of the normal mode, and of normal coordinates, generalizes immediately to
the much less trivial N-atom problem (and also to the field problem). For N
atoms there are (for one-dimensional displacements) N degrees of freedom,
and if we take them to be the actual atomic displacements, the total energy
will be

$$ E = \sum_{r=1}^{N} \tfrac{1}{2} m\dot{q}_r^2 + V(q_1, \ldots, q_N) \tag{5.25} $$


which includes all the couplings between atoms. We assume, as before, that 
the $q_r$’s are small enough so that only quadratic terms need to be kept in V (a
constant is as usual irrelevant, and the linear terms vanish if the $q_r$’s are the
displacements from equilibrium). In this case, the equations of motion will be
linear. By a linear transformation of the form (generalizing (5.12))

$$ Q_r = \sum_{s=1}^{N} a_{rs} q_s \tag{5.26} $$

it is possible to write E as a sum of N separate terms, just as in (5.23):

$$ E = \sum_{r=1}^{N} [\tfrac{1}{2} m\dot{Q}_r^2 + \tfrac{1}{2} m\omega_r^2 Q_r^2]. \tag{5.27} $$

The $Q_r$’s are the normal coordinates and the $\omega_r$’s are the normal frequencies,
and there are N of them. If only one of the $Q_r$’s is non-zero, the N atoms are
moving in a single mode. The fact that the total energy in (5.27) is a sum of
N single-mode energies allows us to say that our N-atom solid behaves as if
it consisted of N separate and free harmonic oscillators which, however, are
not to be identified with the coordinates of the original atoms. Once again,
and now much more crucially, it is the mode coordinates that are the relevant
degrees of freedom rather than those of the original particles.
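
For one concrete choice of couplings the transformation (5.26) is known in closed form. The sketch below assumes a chain of N equal masses with nearest-neighbour couplings and fixed ends, $m\ddot{q}_s = -k(2q_s - q_{s-1} - q_{s+1})$ with $q_0 = q_{N+1} = 0$, which reduces to the two-atom system for N = 2. This particular chain, and all names used, are illustrative assumptions; the argument in the text only requires V to be quadratic.

```python
import math

def mode_vector(r, N):
    """Row r of a_rs in (5.26): a_rs = sqrt(2/(N+1)) * sin(r*s*pi/(N+1))."""
    norm = math.sqrt(2.0 / (N + 1))
    return [norm * math.sin(r * s * math.pi / (N + 1)) for s in range(1, N + 1)]

def mode_frequency(r, N, k=1.0, m=1.0):
    """Normal frequency omega_r = 2*sqrt(k/m)*sin(r*pi/(2*(N+1)))."""
    return 2.0 * math.sqrt(k / m) * math.sin(r * math.pi / (2 * (N + 1)))

def apply_force_matrix(q, k=1.0):
    """Return K q for the tridiagonal K with 2k on the diagonal, -k beside it."""
    N = len(q)
    padded = [0.0] + list(q) + [0.0]  # fixed ends
    return [k * (2 * padded[s] - padded[s - 1] - padded[s + 1])
            for s in range(1, N + 1)]

N = 5
for r in range(1, N + 1):
    a_r = mode_vector(r, N)
    Kq = apply_force_matrix(a_r)
    w2 = mode_frequency(r, N) ** 2
    # Each mode vector satisfies K a_r = m * omega_r^2 * a_r (here m = 1).
    residual = max(abs(Kq[s] - w2 * a_r[s]) for s in range(N))
    print(r, round(mode_frequency(r, N), 6), residual < 1e-12)
```

For N = 2 the same formulas give the frequencies $(k/m)^{1/2}$ and $(3k/m)^{1/2}$ and the combinations $(q_1 \pm q_2)/\sqrt{2}$ of (5.12).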

The second stage in our programme is to treat such systems quantum 
mechanically, as we should certainly have to for a real solid. It is still true 
that — if the potential energy is a quadratic function of the displacements — 
the transformation (5.26) allows us to write the total energy as a sum of N 
mode energies, each of which has the form of a harmonic oscillator. Now, 
however, these oscillators obey the laws of quantum mechanics, so that each 
mode oscillator exists only in certain definite states, whose energy eigenvalues 
are quantized. For each mode of frequency $\omega_r$, the allowed energy values are

$$ \epsilon_r = (n_r + \tfrac{1}{2})\hbar\omega_r \tag{5.28} $$

where $n_r$ is a positive integer or zero. This is in sharp contrast to the classical
case, of course, in which arbitrary values are allowed for the oscillator energies.
The total energy eigenvalue then has the form

$$ E = \sum_{r=1}^{N} (n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.29} $$

The frequencies $\omega_r$ are determined by the interatomic forces and are common
to both the classical and quantum descriptions; in quantum theory, though,
the states of definite energy of the vibrating N-body system are characterized by
the values of a set of integers $(n_1, n_2, \ldots, n_N)$, which determine the energies
of each mode oscillator.
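
The energy eigenvalue (5.29) is a simple function of the occupation numbers, as the following minimal sketch shows. The function name, the choice of units with $\hbar = 1$, and the example frequencies (those of the two-atom system with k = m = 1) are illustrative assumptions.

```python
HBAR = 1.0  # natural units; an illustrative choice

def total_energy(occupations, omegas, hbar=HBAR):
    """Energy eigenvalue (5.29) for occupation numbers (n_1, ..., n_N)."""
    if len(occupations) != len(omegas):
        raise ValueError("need one occupation number per mode")
    if any(n < 0 or n != int(n) for n in occupations):
        raise ValueError("occupation numbers must be non-negative integers")
    return sum((n + 0.5) * hbar * w for n, w in zip(occupations, omegas))

omegas = [1.0, 3.0 ** 0.5]           # the two-atom frequencies with k = m = 1
print(total_energy((0, 0), omegas))  # zero-point energy of both modes
print(total_energy((2, 1), omegas))  # two quanta in mode 1, one in mode 2
```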

For each mode oscillator, $\hbar\omega_r$ measures the quantum of vibrational energy;
the energy of an allowed mode state is determined uniquely by the number $n_r$
of such quanta of energy in the state. We now make a profound reinterpretation
of this result (first given, almost en passant, by Born, Heisenberg and
Jordan (Born et al. 1926) in one of the earliest papers on quantum mechanics).
We forget about the original N degrees of freedom $q_1, q_2, \ldots, q_N$ and the
original N ‘atoms’, which indeed are only remembered in (5.29) via the fact
that there are N different mode frequencies $\omega_r$. Instead we concentrate on
the quanta and treat them as ‘things’ which really determine the behaviour
of our quantum system. We say that ‘in a state with energy $(n_r + \tfrac{1}{2})\hbar\omega_r$ there
are $n_r$ quanta present’. For the state characterized by $(n_1, n_2, \ldots, n_N)$ there
are $n_1$ quanta of mode 1 (frequency $\omega_1$), $n_2$ of mode 2, ... and $n_N$ of mode N.
Note particularly that although the number of modes N is fixed, the values of
the $n_r$’s are unrestricted, except insofar as the total energy is fixed. Thus we
are moving from a ‘fixed number’ picture (N degrees of freedom) to a ‘variable
number’ picture (the $n_r$’s restricted only by the total energy constraint
(5.29)). In the case of a real solid, these quanta of vibrational energy are
called phonons. We summarize the point we have reached by the important
statement that a phonon is an elementary quantum of vibrational excitation.

Now we take one step backward in order, afterwards, to take two steps 
forward. We return to the classical mechanical model with N harmonically 
interacting degrees of freedom. It is possible to imagine increasing the number
N to infinity, and decreasing the interatomic spacing a to zero, in such a
way that the product Na stays finite, say $Na = \ell$. We then have a classical
continuous system, for example a string of length $\ell$. (We stay in one dimension
for simplicity.) The transverse vibrations of this string are now described
by a field $\phi(x, t)$, where at each point x of the string $\phi(x, t)$ measures the
displacement from equilibrium, at the time t, of a small element of string around
the point x. Thus we have passed from a system described by a discrete number
of degrees of freedom, $q_r(t)$ or $Q_r(t)$, to one described by a continuous
degree of freedom, the displacement field $\phi(x, t)$. The discrete suffix r has
become the continuous argument x and, to prepare for later abstraction, we
have denoted the displacement by $\phi(x, t)$ rather than, say, $q(x, t)$.

In the continuous problem the analogue of the small-displacement assumption,
which limited the potential energy in the discrete case to quadratic powers,
implies that $\phi(x, t)$ obeys the wave equation

$$ \frac{1}{c^2}\frac{\partial^2\phi(x, t)}{\partial t^2} = \frac{\partial^2\phi(x, t)}{\partial x^2} \tag{5.30} $$

where c is the wave propagation velocity. Note that (5.30) is linear, but
only by virtue of having made the small-displacement assumption. Again, we
consider first the classical treatment of this system. Our aim is to find, for
this continuous field problem, the analogue of the normal coordinates (or, in
physical terms, the modes of vibration) which were so helpful in the discrete
case. Fortunately, the string’s modes are very familiar. By imposing suitable
boundary conditions at each end of the string, we determine the allowed
wavelengths of waves travelling along the string. Suppose, for simplicity, that
the string is stretched between $x = 0$ and $x = \ell$. This constrains $\phi(x, t)$ to
vanish at these end points. A suitable form for $\phi(x, t)$ which does this is

$$ \phi_r(x, t) = A_r(t)\sin\left(\frac{r\pi x}{\ell}\right) \tag{5.31} $$

where r = 1, 2, 3, ..., which expresses the fact that an exact number of
half-wavelengths must fit onto the interval $(0, \ell)$. Inserting (5.31) into (5.30), we
find

$$ \ddot{A}_r = -\omega_r^2 A_r \tag{5.32} $$

where

$$ \omega_r^2 = r^2\pi^2 c^2/\ell^2. \tag{5.33} $$


FIGURE 5.3
String motion in two normal modes: (a) r = 1 in equation (5.31); (b) r = 2.


Thus the amplitude $A_r(t)$ of the particular waveform (5.31) executes simple
harmonic motion with frequency $\omega_r$. Each motion of the string which has a
definite wavelength also has a definite frequency; it is therefore precisely a
mode. Figure 5.3(a) shows two snapshots of the string when it is oscillating
in the mode for which r = 1, and figure 5.3(b) shows the same for the mode
r = 2; these may be compared with figures 5.2(a) and (b). Just as in the
discrete case, the general motion of the string is a superposition of modes

$$ \phi(x, t) = \sum_{r=1}^{\infty} A_r(t)\sin\left(\frac{r\pi x}{\ell}\right); \tag{5.34} $$

in short, a Fourier series!
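
To make the mode superposition (5.34) concrete, the following sketch computes the sine-series amplitudes of an initial string shape and rebuilds the shape from them. The triangular ‘plucked’ profile, the truncation at 50 modes, the midpoint-rule quadrature and all names are illustrative assumptions, not taken from the text.

```python
import math

def sine_coefficient(shape, r, length, samples=2000):
    """A_r = (2/l) * integral_0^l shape(x) sin(r pi x / l) dx (midpoint rule)."""
    dx = length / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * dx
        total += shape(x) * math.sin(r * math.pi * x / length)
    return 2.0 * total * dx / length

def superpose(coefficients, x, length):
    """Evaluate the mode sum (5.34) at one instant from a list [A_1, A_2, ...]."""
    return sum(A * math.sin(r * math.pi * x / length)
               for r, A in enumerate(coefficients, start=1))

length, height = 1.0, 0.01

def plucked(x):
    """Triangular initial shape (illustrative), vanishing at both ends."""
    if x < length / 2:
        return 2 * height * x / length
    return 2 * height * (length - x) / length

A = [sine_coefficient(plucked, r, length) for r in range(1, 51)]
print(superpose(A, 0.5 * length, length), plucked(0.5 * length))
print(superpose(A, 0.0, length), superpose(A, length, length))  # fixed ends
```

Each amplitude then evolves independently according to (5.32), which is what makes the mode description so convenient.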

We must now examine the total energy of the vibrating string, which 
we expect to be greatly simplified by the use of the mode concept. The total 
energy is the continuous analogue of the discrete summation in (5.25), namely
the integral

$$ E = \int_0^{\ell} \left[ \tfrac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 + \tfrac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \right] dx \tag{5.35} $$

where the first term is the kinetic energy and the second is the potential
energy ($\rho$ is the mass per unit length of the string, assumed constant). As
noted earlier, the potential energy term arises from an approximation which
limits it to the quadratic power. To relate this to the earlier discrete case,
note that the derivative may be regarded as $[\phi(x + \delta x) - \phi(x)]/\delta x$ as $\delta x \to 0$,
so that the square of the derivative involves the ‘nearest neighbour coupling’
$\phi(x + \delta x)\phi(x)$, analogous to the $q_1 q_2$ term in (5.9).

Inserting (5.34) into (5.35), and using the orthogonality of the sine functions
on the interval $(0, \ell)$, one obtains (problem 5.1) the crucial result

$$ E = (\ell/2) \sum_{r=1}^{\infty} [\tfrac{1}{2}\rho\dot{A}_r^2 + \tfrac{1}{2}\rho\omega_r^2 A_r^2]. \tag{5.36} $$

Indeed, just as in the discrete case, the total energy of the string can be
written as a sum of individual mode energies. We note that the Fourier
amplitude $A_r$ acts as a normal coordinate. Comparing (5.36) with (5.27), we
see that the string behaves exactly like a system of independent uncoupled
oscillators, the only difference being that now there are an infinite number
of them, corresponding to the infinite number of degrees of freedom in the
continuous field $\phi(x, t)$. The normal coordinates $A_r(t)$ are, for many purposes,
a much more relevant set of degrees of freedom than the original displacements
$\phi(x, t)$.
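
The mode decomposition (5.36) can be checked numerically by comparing it with a direct evaluation of the integral (5.35) for a field built from a few modes. In this sketch the mode amplitudes, the values of $\rho$, c and $\ell$, and the function names are arbitrary illustrative choices.

```python
import math

def energy_by_integration(modes, rho, c, length, samples=4000):
    """Midpoint-rule evaluation of (5.35); modes maps r -> (A_r, dA_r/dt)."""
    dx = length / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * dx
        phi_t = sum(Adot * math.sin(r * math.pi * x / length)
                    for r, (A, Adot) in modes.items())
        phi_x = sum(A * (r * math.pi / length) * math.cos(r * math.pi * x / length)
                    for r, (A, Adot) in modes.items())
        total += 0.5 * rho * phi_t ** 2 + 0.5 * rho * c ** 2 * phi_x ** 2
    return total * dx

def energy_by_modes(modes, rho, c, length):
    """Mode sum (5.36) with omega_r = r pi c / l from (5.33)."""
    total = 0.0
    for r, (A, Adot) in modes.items():
        omega = r * math.pi * c / length
        total += 0.5 * rho * Adot ** 2 + 0.5 * rho * omega ** 2 * A ** 2
    return 0.5 * length * total

modes = {1: (0.02, 0.1), 2: (-0.01, 0.3), 5: (0.004, -0.2)}  # illustrative values
print(energy_by_integration(modes, rho=0.5, c=2.0, length=1.5))
print(energy_by_modes(modes, rho=0.5, c=2.0, length=1.5))
```

The two numbers agree because the cross terms between different modes integrate to zero over the length of the string.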

The final step is to apply quantum mechanics to this classical field sys- 
tem. Once again, the total energy is equivalent to that of a sum of (infinitely 
many) mode oscillators, each of which has to be quantized. The total energy 
eigenvalue has the form (5.29), except that now the sum extends to infinity: 


$$ E = \sum_{r=1}^{\infty} (n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.37} $$

The excited states of the quantized field $\hat{\phi}(x, t)$ are characterized by saying
how many phonons of each frequency are present; the ground state has no
phonons at all. We remark that as $\ell \to \infty$, the mode sum in (5.36) or (5.37)
will be replaced by an integral over a continuous frequency variable.

We have now completed, in outline, the programme introduced earlier, 
ending up with the quantization of a ‘mechanical’ system. All of the forego- 
ing, it must be clearly emphasized, is absolutely basic to modern solid state 
physics. The essential idea — quantizing independent modes — can be ap- 
plied to an enormous variety of ‘oscillations’. In all cases the crucial concept 
is the elementary excitation — the mode quantum. Thus we have plasmons 
(quanta of plasma oscillations), magnons (magnetic oscillations), ..., as well 
as phonons (vibrational oscillations). All this is securely anchored in the 
physics of many-body systems. 

Now we come to the use of these ideas as an analogy, to help us understand 
the (presumably non-mechanical) quantum fields with which we shall actually 
be concerned in this book — for example the electromagnetic field. Consider a 
region of space containing electromagnetic fields. These fields obey (a three- 
dimensional version of) the wave equation (5.30), with c now standing for 
the speed of light. By imposing suitable boundary conditions, the total elec- 
tromagnetic energy in any region of space can be written as a sum of mode 
energies. Each mode has the form of an oscillator, whose amplitude is (see 
(5.31)) the Fourier component of the wave, for a given wavelength. These 
oscillators are all quantized. Their quanta are called photons. Thus, a photon 
is an elementary quantum of excitation of the electromagnetic field. 

So far the only kind of ‘particle’ we have in our relativistic quantum field 
theoretic world is the photon. What about the electron, say? Well, recalling 
Feynman again, ‘There is one lucky break, however — electrons behave just 
like light’. In other words, we shall also regard an electron as an elementary 
quantum of excitation of an ‘electron field’. What is ‘waving’ to supply the
vibrations for this electron field? We do not answer this question just as we did
not for the photon. We postulate a relativistic quantum field for the electron 
which obeys some suitable wave equation — in this case, for non-interacting 
electrons, the Dirac equation. The field is expanded as a sum of Fourier 
components, as with the electromagnetic field. Each component behaves as 
an independent oscillator degree of freedom (and there are, of course, an 
infinite number of them); the quanta of these oscillators are electrons. 

Actually this, though correctly expressing the basic idea, omits one crucial 
factor, which makes it almost fraudulently oversimplified. There is of course 
one very big difference between photons and electrons. The former are bosons 
and the latter are fermions; photons have spin angular momentum of one 
(in units of $\hbar$), electrons of one-half. It is very difficult, if not downright
impossible, to construct any mechanical model at all which has fermionic
excitations. Phonons have spin-1, in fact, corresponding to the three states of
polarization of the corresponding vibrational waves. But ‘phonons’ carrying
spin-$\tfrac{1}{2}$ are hard to come by. No matter, you may say, Maxwell has weaned
us away from jelly, so we shall be grown up and boldly postulate the electron 
field as a basic thing. 

Certainly this is what we do. But we also know that fermionic particles, 
like electrons, have to obey an exclusion principle: no two identical fermions 
can have the same quantum numbers. In chapter 7, we shall learn how the 
idea sketched here must be modified for fields whose quanta are fermions. 




5.2 The quantum field: (ii) Lagrange-Hamilton 
formulation 


5.2.1 The action principle: Lagrangian particle mechanics 


We must now make the foregoing qualitative picture more mathematically 
precise. It is clear that we would like a formalism capable of treating, within 
a single overall framework, the mechanics of both fields and particles, in both 
classical and quantum aspects. Remarkably enough, such a framework does 
exist (and was developed long before quantum field theory): Hamilton’s prin- 
ciple of least action, with the action defined in terms of a Lagrangian. We 
strongly recommend the reader with no prior acquaintance with this pro- 
found approach to physical laws read chapter 19 of volume 2 of Feynman’s 
Lectures on Physics (Feynman 1964). 

The least action approach differs radically from the more familiar one 
which can conveniently be called ‘Newtonian’. Consider the simplest case, 
that of classical particle mechanics. In the Newtonian approach, equations 
of motion are postulated which involve forces as the essential physical input; 
from these, the trajectories of the particle can be calculated. In the least
action approach, equations of motion are not postulated as basic, and the
primacy of forces yields to that of potentials. The path by which a particle
actually travels is determined by the postulate (or principle) that it has to
follow that particular path, out of infinitely many possible ones, for which a
certain quantity, the action, is minimized. The action S is defined by

$$ S = \int_{t_1}^{t_2} L(q(t), \dot{q}(t))\, dt \tag{5.38} $$

where $q(t)$ is the position of the particle as a function of time, $\dot{q}(t)$ is its
velocity and the all-important function L is the Lagrangian. Given L as an
explicit function of the variables $q(t)$ and $\dot{q}(t)$, we can imagine evaluating S
for all sorts of possible $q(t)$’s starting at time $t_1$ and ending at time $t_2$. We
can draw these different possible trajectories on a q versus t diagram as in
figure 5.4. For each path we evaluate S: the actual path is the one for which
S is smallest, by hypothesis.


FIGURE 5.4
Possible space-time trajectories from ‘Here’ ($q(t_1)$) to ‘There’ ($q(t_2)$).

But what is L? In simple cases (as we shall verify later) L is just T — V, 
the difference of kinetic and potential energies. Thus for a single particle in a 
potential V 

$$ L = \tfrac{1}{2} m\dot{x}^2 - V(x). \tag{5.39} $$

Knowing V(x), we can try and put the ‘action principle’ into action. However,
how can we set about finding which trajectory minimizes S? It is quite
interesting to play with some simple specific examples and actually calculate
S for several ‘fictitious’ trajectories (i.e. ones that we know from the Newtonian
approach are not followed by the particle) and try and get a feeling for
what the actual trajectory that minimizes S might be like (of course it is the
Newtonian one; see problem 5.2). But clearly this is not a practical answer
to the general problem of finding the q(t) that minimizes S. Actually, we can
solve this problem by calculus.

Our problem is something like the familiar one of finding the point $t_0$ at
which a certain function f(t) has a stationary value. In the present case,
however, the function S is not a simple function of t; rather it is a function
of the entire set of points q(t). It is a function of the function q(t), or a
‘functional’ of q(t). We want to know what particular ‘$q_c(t)$’ minimizes S.
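
The suggestion of calculating S for several ‘fictitious’ trajectories can be tried numerically. The sketch below assumes a particle in a uniform gravitational field, $V(x) = mgx$, travelling from x = 0 at t = 0 back to x = 0 at t = T, and perturbs the Newtonian path by a term proportional to $\sin(\pi t/T)$. The potential, the family of trial paths and the discretization are illustrative assumptions, not taken from the text.

```python
import math

def action(path, dt, m=1.0, g=9.8):
    """Discretized S = sum of (T - V) dt for L = (1/2) m xdot^2 - m g x."""
    S = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        velocity = (x1 - x0) / dt
        S += (0.5 * m * velocity ** 2 - m * g * 0.5 * (x0 + x1)) * dt
    return S

def trial_path(eps, T=1.0, steps=1000, g=9.8):
    """Newtonian path x = (g/2) t (T - t) plus eps * sin(pi t / T).

    Every member of the family starts and ends at x = 0.
    """
    dt = T / steps
    times = [i * dt for i in range(steps + 1)]
    path = [0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T) for t in times]
    return path, dt

for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    path, dt = trial_path(eps)
    print(eps, action(path, dt))  # smallest at eps = 0
```

Within this one-parameter family the action grows quadratically with the size of the deviation from the Newtonian path, which is the behaviour expected near a minimum.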

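By analogy with the single-variable case, we consider a small variation $\delta q(t)$
in the path from $q(t_1)$ to $q(t_2)$. At the minimum, the change $\delta S$ corresponding
to the change $\delta q$ must vanish. This change in the action is given by

$$ \delta S = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q(t)}\,\delta q(t) + \frac{\partial L}{\partial\dot{q}(t)}\,\delta\dot{q}(t) \right) dt. \tag{5.40} $$

Using $\delta\dot{q}(t) = d(\delta q(t))/dt$ and integrating the second term by parts yields

$$ \delta S = \int_{t_1}^{t_2} \delta q(t)\left[ \frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)} \right] dt + \left[ \frac{\partial L}{\partial\dot{q}(t)}\,\delta q(t) \right]_{t_1}^{t_2}. \tag{5.41} $$

Since we are considering variations of path in which all trajectories start at $t_1$
and end at $t_2$, $\delta q(t_1) = \delta q(t_2) = 0$. So the condition that S be stationary is

$$ \delta S = \int_{t_1}^{t_2} \delta q(t)\left[ \frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)} \right] dt = 0. \tag{5.42} $$

Since this must be true for arbitrary $\delta q(t)$, we must have

$$ \frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)} = 0. \tag{5.43} $$

This is the celebrated Euler-Lagrange equation of motion. Its solution gives
the ‘$q_c(t)$’ which the particle actually follows.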

We can see how this works for the simple case (5.39) where q is the coor- 
dinate x. We have immediately 


$$ \partial L/\partial\dot{x} = m\dot{x} = p \tag{5.44} $$

and

$$ \partial L/\partial x = -\partial V/\partial x = F \tag{5.45} $$


where p and F are, respectively, the momentum and the force of the Newtonian 
approach. The Euler-Lagrange equation then reads 


F = dp/dt (5.46) 


precisely the Newtonian equation of motion. For the special case of a harmonic 
oscillator (obviously fundamental for the quantum field idea, as section 5.1 
should have made clear), we have 


$$ L = \tfrac{1}{2} m\dot{x}^2 - \tfrac{1}{2} m\omega^2 x^2 \tag{5.47} $$

which can be immediately generalized to N independent oscillators (see
section 5.1) via

$$ L = \sum_{r=1}^{N} (\tfrac{1}{2} m\dot{Q}_r^2 - \tfrac{1}{2} m\omega_r^2 Q_r^2). \tag{5.48} $$

For many dynamical systems, the Lagrangian has the form ‘$T - V$’ indicated
in (5.47) and (5.48).

Our next step will be to replace classical particle mechanics by quantum 
particle mechanics. The standard way to do this is via the Hamiltonian formu- 
lation of classical mechanics, which we will now briefly review for the simple 
system with Lagrangian (5.39). In Hamiltonian dynamics, the variables used 
are not the Lagrangian ones of position x and velocity $\dot{x}$, but rather the
position x and the canonical momentum p, where p is defined by

$$ p = \frac{\partial L}{\partial\dot{x}}. \tag{5.49} $$

The place of the Lagrangian is taken by the Hamiltonian $H(x, p)$ which is
defined by

$$ H(x, p) = p\dot{x} - L. \tag{5.50} $$

Using (5.39) for L we find $p = m\dot{x}$, and placing this result in (5.50) we obtain

$$ H(x, p) = \frac{p^2}{2m} + V(x) \tag{5.51} $$

which in this case is just the total energy, expressed in terms of x and p.
Instead of the Euler-Lagrange equation we have the Hamiltonian equations of
motion, which are

$$ \frac{\partial H}{\partial p} = \dot{x} \tag{5.52} $$

and

$$ \frac{\partial H}{\partial x} = -\dot{p}. \tag{5.53} $$

For the case (5.51) these equations yield

$$ p/m = \dot{x} \tag{5.54} $$

and

$$ \dot{p} = -\partial V/\partial x. \tag{5.55} $$

Equation (5.54) is just the familiar relation of p to $\dot{x}$, and (5.55) is the
Newtonian equation of motion. In the same way, the reader may check that the
Hamiltonian for the assembly of oscillators described by the Lagrangian (5.48)
is

$$ H = \sum_{r=1}^{N} \left( \frac{P_r^2}{2m} + \tfrac{1}{2} m\omega_r^2 Q_r^2 \right) \tag{5.56} $$

where $P_r = m\dot{Q}_r$.
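
Hamilton's equations (5.52) and (5.53) are first order in time and therefore straightforward to step forward numerically. The following sketch does this for a single oscillator; the leapfrog scheme, the step size and the values m = ω = 1 are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def hamiltonian(x, p, m=1.0, omega=1.0):
    """H = p^2/2m + (1/2) m omega^2 x^2, i.e. (5.51) for the oscillator."""
    return p * p / (2 * m) + 0.5 * m * omega ** 2 * x * x

def evolve(x, p, m=1.0, omega=1.0, dt=1e-3, steps=1000):
    """Leapfrog integration of Hamilton's equations (5.52)-(5.53):
    xdot = dH/dp = p/m,  pdot = -dH/dx = -m omega^2 x."""
    for _ in range(steps):
        p -= 0.5 * dt * m * omega ** 2 * x  # half step in p
        x += dt * p / m                     # full step in x
        p -= 0.5 * dt * m * omega ** 2 * x  # half step in p
    return x, p

x0, p0 = 1.0, 0.0
x1, p1 = evolve(x0, p0)                           # evolve to t = 1
print(x1, math.cos(1.0))                          # position vs exact solution
print(hamiltonian(x1, p1) - hamiltonian(x0, p0))  # energy drift, close to zero
```
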
With this in hand, we turn to quantum particle mechanics.


5.2.2 Quantum particle mechanics à la Heisenberg-Lagrange-Hamilton


It seems likely that a particularly direct correspondence between the quantum 
and the classical cases will be obtained if we use the Heisenberg formulation 
(or ‘picture’) of quantum mechanics (see appendix I). In the Schrödinger pic- 
ture, the dynamical variables such as position x are independent of time, 
and the time dependence is carried by the wavefunction. Thus we seem to 
have nothing like the q(t)’s. However, one can always do a unitary trans- 
formation to the Heisenberg picture, in which the wavefunction is fixed and 
the dynamical variables change with time. This is what we want in order to 
parallel the classical quantities q(t). But of course there is one fundamental 
difference between quantum mechanics and classical mechanics: in the former, 
the dynamical variables are operators which in general do not commute. In 
particular, the fundamental commutator states that ($\hbar = 1$)

$$ [\hat{q}(t), \hat{p}(t)] = i \tag{5.57} $$

where ‘ˆ’ indicates the operator character of the quantity. Here $\hat{p}$ is defined by
the generalization of (5.44):

$$ \hat{p} = \partial\hat{L}/\partial\dot{\hat{q}}. \tag{5.58} $$

In this formulation of quantum mechanics we do not have the Schrödinger-type
equation of motion. Instead we have the Heisenberg equation of motion

$$ \dot{\hat{A}} = -i[\hat{A}, \hat{H}] \tag{5.59} $$

where the Hamiltonian operator $\hat{H}$ is defined in terms of the Lagrangian
operator $\hat{L}$ by (cf (5.50))

$$ \hat{H} = \hat{p}\dot{\hat{q}} - \hat{L} \tag{5.60} $$

and $\hat{A}$ is any dynamical observable. For example, in the oscillator case

$$ \hat{L} = \tfrac{1}{2} m\dot{\hat{q}}^2 - \tfrac{1}{2} m\omega^2\hat{q}^2 \tag{5.61} $$
$$ \hat{p} = m\dot{\hat{q}} \tag{5.62} $$

and

$$ \hat{H} = \frac{1}{2m}\hat{p}^2 + \tfrac{1}{2} m\omega^2\hat{q}^2 \tag{5.63} $$

which is the total energy operator. Note that $\hat{p}$, obtained from the Lagrangian
using (5.58), had better be consistent with the Heisenberg equation of motion
for the operator $\hat{A} = \hat{q}$. The Heisenberg equation of motion for $\hat{A} = \hat{p}$ leads
to

$$ \dot{\hat{p}} = -m\omega^2\hat{q} \tag{5.64} $$

which is an operator form of Newton’s law for the harmonic oscillator. Using
the expression for $\hat{p}$ (5.62), we find

$$ \ddot{\hat{q}} = -\omega^2\hat{q}. \tag{5.65} $$

Now, although this looks like the familiar classical equation of motion
for the position of the oscillator — and recovering it from the Lagrangian 
formalism is encouraging — we must be very careful to appreciate that this is 
an equation stating how an operator evolves with time. Where the quantum 
particle will actually be found is an entirely different matter. By sandwiching 
(5.65) between wavefunctions, we can at once see that the average position of 
the particle will follow the classical trajectory (remember that wavefunctions 
are independent of time in the Heisenberg formulation). But fluctuations 
about this trajectory will certainly occur: a quantum particle does not follow 
a ray-like classical trajectory. Come to think of it, neither does a photon! 

In the original formulations of quantum theory, such fluctuations were gen- 
erally taken to imply that the very notion of a ‘path’ was no longer a useful 
one. However, just as the differential equations satisfied by operators in the 
Heisenberg picture are quantum generalizations of Newtonian mechanics, so 
there is an analogous quantum generalization of the ‘path-contribution to the 
action’ approach to classical mechanics. The idea was first hinted at by Dirac 
(1933, 1981, section 32), but it was Feynman who worked it out completely. 
The book by Feynman and Hibbs (1965) presents a characteristically fascinating
discussion; here we only wish to indicate the central idea. We ask:
how does a particle get from the point $q(t_1)$ at time $t_1$ to the point $q(t_2)$ at
$t_2$? Referring back to figure 5.4, in the classical case we imagined (infinitely)
many possible paths $q_i(t)$, of which, however, only one was the actual path
followed, namely the one we called $q_c(t)$ which minimized the action integral
(5.38) as a functional of q(t). In the quantum case, however, we previously
noted that a particle will no longer follow any definite path, because of quan- 
tum fluctuations. But rather than, as a consequence, throwing away the whole 
idea of a path, Feynman’s insight was to appreciate that the ‘opposite’ view- 
point is also possible: since unique paths are forbidden in quantum theory, we 
should in principle include all possible paths! In other words, we take all the 
trajectories on figure 5.4 as physically possible (together with all the other 
infinitely many ways of accomplishing the trip). 

However, surely not all paths are equally likely: after all, we must presumably
recover the classical trajectory as $\hbar \to 0$, in some sense. Thus we must
find an appropriate weighting for the paths. Feynman’s recipe is beautifully
simple: weight each path by the factor

$$ e^{iS/\hbar} \tag{5.66} $$

where S is the action for that particular path. At first sight this is a rather
strange proposal, since all paths, even the classical one, are weighted by a
quantity which is of unit modulus. But of course contributions of the form
(5.66) from all the paths have to be added coherently, just as we superposed
the amplitudes in the ‘two-slit’ discussion in section 2.5. What distinguishes
the classical path $q_c(t)$ is that it makes S stationary under small changes of
path: thus in its vicinity paths have a strong tendency to add up constructively,
while far from it the phase factors will tend to produce cancellations.
The amount a quantum particle can ‘stray’ from the classical path depends
on the magnitude of the corresponding action relative to $\hbar$, the quantum of
action: the scale of coherence is set by $\hbar$.
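
The constructive addition near the stationary path, and the cancellation far from it, can be illustrated with a toy model. The sketch below assumes a one-parameter family of paths labelled by a deviation eps, with $S(\epsilon)/\hbar = \lambda\epsilon^2$ stationary at eps = 0; this quadratic form, the value of $\lambda$ and the integration windows are illustrative assumptions, and the sum over the family stands in for the much larger sum over all paths.

```python
import cmath

def phase_sum(lo, hi, action_over_hbar, samples=40000):
    """Midpoint-rule estimate of the integral of exp(i S(eps)/hbar) over a
    one-parameter family of paths labelled by eps in [lo, hi]."""
    d_eps = (hi - lo) / samples
    total = 0j
    for j in range(samples):
        eps = lo + (j + 0.5) * d_eps
        total += cmath.exp(1j * action_over_hbar(eps))
    return total * d_eps

# Assumed toy model: S(eps)/hbar = lam * eps**2, stationary at eps = 0,
# with lam large (action large compared with hbar).
lam = 200.0
near = phase_sum(-1.0, 1.0, lambda eps: lam * eps ** 2)
far = phase_sum(2.0, 4.0, lambda eps: lam * eps ** 2)
print(abs(near), abs(far))  # paths near the stationary one dominate
```

Both windows have the same width, yet the one containing the stationary point contributes far more; increasing `lam` (i.e. decreasing $\hbar$ relative to the action) narrows the region of coherent addition.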

In summary, then, the quantum mechanical amplitude to go from $q(t_1)$ to
$q(t_2)$ is proportional to

$$ \sum_{\text{all paths } q(t)} \exp\left( \frac{i}{\hbar}\int_{t_1}^{t_2} L(q(t), \dot{q}(t))\, dt \right). \tag{5.67} $$



There is an evident generalization to quantum field theory. We shall not, 
however, make use of the ‘path integral’ approach to quantum field theory in 
this volume. Its use was, in fact, decisive in obtaining the Feynman rules for 
non-Abelian gauge theories; and it is the only approach suitable for numerical 
studies of quantum field theories (how can operators be simulated numeri- 
cally?). Nevertheless, for a first introduction to quantum field theory, there 
is still much to be said for the traditional approach based on ‘quantizing the 
modes’, and this is the path we shall follow in the rest of this volume. Not the 
least of its advantages is that it contains the intuitively powerful ‘calculus’ of 
creation and annihilation operators, as we now describe. We shall return to 
the path integral formalism in chapter 16 of volume 2. 


5.2.3 Interlude: the quantum oscillator 


As we saw in section 5.1, we need to know the energy spectrum and associated 
states of a quantum harmonic oscillator. This is a standard problem, but there 
is one particular way of solving it — the ‘operator’ approach due to Dirac (1981, 
chapter 6) — that is so crucial to all subsequent development that we include 
a discussion here in the body of the text. 

For the oscillator Hamiltonian 


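$$ \hat{H} = \frac{1}{2m}\hat{p}^2 + \tfrac{1}{2} m\omega^2\hat{q}^2 \tag{5.68} $$

if $\hat{p}$ and $\hat{q}$ were not operators, we could attempt to factorize the Hamiltonian
in the form ‘$(q + ip)(q - ip)$’ (apart from the factors of 2m and $\omega$). In the
quantum case, in which $\hat{p}$ and $\hat{q}$ do not commute, it still turns out to be very
helpful to introduce such combinations. If we define the operator

$$ \hat{a} = \frac{1}{\sqrt{2}}\left( \sqrt{m\omega}\,\hat{q} + \frac{i}{\sqrt{m\omega}}\,\hat{p} \right) \tag{5.69} $$

and its Hermitian conjugate

$$ \hat{a}^\dagger = \frac{1}{\sqrt{2}}\left( \sqrt{m\omega}\,\hat{q} - \frac{i}{\sqrt{m\omega}}\,\hat{p} \right) \tag{5.70} $$

the Hamiltonian may be written as (see problem 5.4)

$$ \hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega. \tag{5.71} $$

The second form for $\hat{H}$ may be obtained from the first using the commutation
relation between $\hat{a}$ and $\hat{a}^\dagger$

$$ [\hat{a}, \hat{a}^\dagger] = 1 \tag{5.72} $$

derived using the fundamental commutator between $\hat{p}$ and $\hat{q}$. Using this basic
commutator (5.72) and our expression for $\hat{H}$, (5.71), one can prove the
relations (see problem 5.4)

$$ [\hat{H}, \hat{a}] = -\omega\hat{a} \qquad [\hat{H}, \hat{a}^\dagger] = \omega\hat{a}^\dagger. \tag{5.73} $$
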
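Consider now a state $|n\rangle$ which is an eigenstate of $\hat{H}$ with energy $E_n$:

$$\hat{H}|n\rangle = E_n|n\rangle. \tag{5.74}$$

Using this definition and the commutators (5.73), we can calculate the energy
of the states $(\hat{a}^\dagger|n\rangle)$ and $(\hat{a}|n\rangle)$. We find

$$\hat{H}(\hat{a}^\dagger|n\rangle) = (E_n + \omega)(\hat{a}^\dagger|n\rangle) \tag{5.75}$$

$$\hat{H}(\hat{a}|n\rangle) = (E_n - \omega)(\hat{a}|n\rangle). \tag{5.76}$$

Thus the operators $\hat{a}^\dagger$ and $\hat{a}$ respectively raise and lower the energy of $|n\rangle$ by
one unit of $\omega$ ($\hbar = 1$). Now since $\hat{H} \sim \hat{p}^2 + \hat{q}^2$ with $\hat{p}$ and $\hat{q}$ Hermitian, we can
prove that $\langle\psi|\hat{H}|\psi\rangle$ is positive-definite for any state $|\psi\rangle$. Thus the operator $\hat{a}$
cannot lower the energy indefinitely: there must exist a lowest state $|0\rangle$ such
that

$$\hat{a}|0\rangle = 0. \tag{5.77}$$

This defines the lowest-energy state of the system; its energy is given by

$$\hat{H}|0\rangle = \tfrac{1}{2}\omega|0\rangle \tag{5.78}$$

the ‘zero-point energy’ of the quantum oscillator. The first excited state is

$$|1\rangle = \hat{a}^\dagger|0\rangle \tag{5.79}$$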


with energy $(1 + \tfrac{1}{2})\omega$. The $n$th state has energy $(n + \tfrac{1}{2})\omega$ and is proportional
to $(\hat{a}^\dagger)^n|0\rangle$. To obtain a normalization

$$\langle n|n\rangle = 1 \tag{5.80}$$

the correct normalization factor can be shown to be (problem 5.4) $1/\sqrt{n!}$, so that

$$|n\rangle = \frac{1}{\sqrt{n!}}(\hat{a}^\dagger)^n|0\rangle. \tag{5.81}$$

Returning to the eigenvalue equation for $\hat{H}$, we have arrived at the result

$$\hat{H}|n\rangle = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega|n\rangle = (n + \tfrac{1}{2})\omega|n\rangle \tag{5.82}$$

so that the state $|n\rangle$ defined by (5.81) is an eigenstate of the number operator
$\hat{n} = \hat{a}^\dagger\hat{a}$, with integer eigenvalue $n$:

$$\hat{n}|n\rangle = n|n\rangle. \tag{5.83}$$
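The ladder-operator algebra above can be checked numerically in a truncated number basis. The following is a minimal sketch, not part of the original text: the helper names (`ladder_matrices`, `matmul`, `commutator`) and the truncation size `DIM` are illustrative assumptions, and $\hbar = 1$ as in the text. Because the basis is cut off at $|{\rm DIM}-1\rangle$, the commutator (5.72) holds on every diagonal entry except the last one, which is a pure truncation artefact.

```python
import math

def ladder_matrices(dim):
    """Return (a, adag) as dim x dim nested lists, with entry [i][j] = <i|op|j>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # a|n> = sqrt(n)|n-1>
    adag = [[a[j][i] for j in range(dim)] for i in range(dim)]  # transpose (real entries)
    return a, adag

def matmul(x, y):
    """Plain matrix product of two square nested lists."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def commutator(x, y):
    """Return [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

DIM = 6      # hypothetical truncation: states |0>, ..., |5>
OMEGA = 1.0  # oscillator frequency in units with hbar = 1

a, adag = ladder_matrices(DIM)
number = matmul(adag, a)  # number operator a^dagger a
hamiltonian = [[OMEGA * (number[i][j] + (0.5 if i == j else 0.0)) for j in range(DIM)]
               for i in range(DIM)]
comm = commutator(a, adag)
energies = [hamiltonian[n][n] for n in range(DIM)]  # H is diagonal in this basis
```

Here `energies` reproduces $(n + \tfrac{1}{2})\omega$ as in (5.82), and `comm` is the unit matrix apart from the final diagonal entry.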
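It is straightforward to generalize all the foregoing to a system whose
Lagrangian is a sum of $N$ independent oscillators, as in (5.48):

$$L = \sum_{r=1}^{N}\left(\tfrac{1}{2}m\dot{q}_r^2 - \tfrac{1}{2}m\omega_r^2 q_r^2\right). \tag{5.84}$$

The required generalization of the basic commutation relations (5.57) is

$$[\hat{q}_r, \hat{p}_s] = i\delta_{rs}, \qquad [\hat{q}_r, \hat{q}_s] = [\hat{p}_r, \hat{p}_s] = 0 \tag{5.85}$$

since the different oscillators labelled by the index $r$ or $s$ are all independent.
The Hamiltonian is (cf (5.56))

$$\hat{H} = \sum_{r=1}^{N}\left[(1/2m)\hat{p}_r^2 + \tfrac{1}{2}m\omega_r^2\hat{q}_r^2\right] \tag{5.86}$$

$$\phantom{\hat{H}} = \sum_{r=1}^{N}(\hat{a}_r^\dagger\hat{a}_r + \tfrac{1}{2})\omega_r \tag{5.87}$$

with $\hat{a}_r$ and $\hat{a}_r^\dagger$ defined via the analogues of (5.69) and (5.70). Since the
eigenvalues of each number operator $\hat{n}_r = \hat{a}_r^\dagger\hat{a}_r$ are $n_r$, by the previous results,
the eigenvalues of $\hat{H}$ indeed have the form (5.29),

$$E = \sum_{r=1}^{N}(n_r + \tfrac{1}{2})\omega_r. \tag{5.88}$$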


The corresponding eigenstates are products $|n_1\rangle|n_2\rangle\cdots|n_N\rangle$ of $N$ individual
oscillator eigenstates, where $|n_r\rangle$ contains $n_r$ quanta of excitation, of
frequency $\omega_r$; the product state is usually abbreviated to $|n_1, n_2, \ldots, n_N\rangle$. In the
ground state of the system, each individual oscillator is unexcited: this state
is $|0, 0, \ldots, 0\rangle$, which is abbreviated to $|0\rangle$, where it is understood that

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r. \tag{5.89}$$

The operators $\hat{a}_r^\dagger$ create oscillator quanta; the operators $\hat{a}_r$ destroy oscillator
quanta.
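As a small illustration of (5.88), the energy of a product state is fixed entirely by its occupation numbers. The sketch below is not from the original text; the function name `total_energy` and the sample frequencies are hypothetical, and $\hbar = 1$.

```python
def total_energy(occupations, omegas):
    """Energy of |n_1, ..., n_N>: the sum over r of (n_r + 1/2) * omega_r."""
    if len(occupations) != len(omegas):
        raise ValueError("need one occupation number per oscillator")
    return sum((n + 0.5) * w for n, w in zip(occupations, omegas))

omegas = [1.0, 2.0, 3.0]                   # hypothetical mode frequencies
ground = total_energy([0, 0, 0], omegas)   # zero-point energy of all three modes
excited = total_energy([2, 0, 1], omegas)  # two quanta in mode 1, one in mode 3
gap = excited - ground                     # the zero-point part cancels in the difference
```

The difference `gap` equals $\sum_r n_r\omega_r$, which is the quantity that survives once the zero-point energy is discarded.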


5.2.4 Lagrange–Hamilton classical field mechanics


We now consider how to use the Lagrange-Hamilton approach for a field, 
starting again with the classical case and limiting ourselves to one dimension 
to start with. 
FIGURE 5.5
The passage from a large number of discrete degrees of freedom (mass points) 
to a continuous degree of freedom (field). 


As explained in the previous section, we shall have in mind the $N \to \infty$
limit of the $N$ degrees of freedom case

$$\{q_r(t);\ r = 1, 2, \ldots, N\} \to \phi(x,t) \tag{5.90}$$

where $x$ is now a continuous variable labelling the displacement of the ‘string’
(to picture a concrete system, see figure 5.5). At each point $x$ we have an
independent degree of freedom $\phi(x,t)$; thus the field system has a ‘continuous
infinity’ of degrees of freedom. We now formulate everything in terms of a
Lagrangian density $\mathcal{L}$:

$$S = \int dt\, L \tag{5.91}$$

where (in one dimension)

$$L = \int \mathcal{L}\, dx. \tag{5.92}$$

Equation (5.90) suggests that $\phi$ has dimension of [length], and since in the
discrete case $L = T - V$, $\mathcal{L}$ has dimension [energy/length]. (In general $\mathcal{L}$ has
dimension [energy/volume].)
A new feature arises because $\phi$ is now a continuous function of $x$, so that
$\mathcal{L}$ can depend on $\partial\phi/\partial x$ as well as on $\phi$ and $\dot\phi = \partial\phi/\partial t$: $\mathcal{L} = \mathcal{L}(\phi, \partial\phi/\partial x, \dot\phi)$.
As before, we postulate the same fundamental principle

$$\delta S = 0 \tag{5.93}$$


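meaning that the dynamics of the field $\phi$ is governed by minimizing $S$. This
time the total variation is given by

$$\delta S = \int dt \int \left[\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\,\delta\!\left(\frac{\partial\phi}{\partial x}\right) + \frac{\partial\mathcal{L}}{\partial\dot\phi}\,\delta\dot\phi\right] dx. \tag{5.94}$$

Integrating the $\delta\dot\phi$ by parts in $t$, and the $\delta(\partial\phi/\partial x)$ by parts in $x$, and discarding
the resulting ‘surface’ terms, we obtain

$$\delta S = \int dt \int dx\, \delta\phi \left[\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot\phi}\right)\right]. \tag{5.95}$$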
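Since $\delta\phi$ is an arbitrary function, the requirement $\delta S = 0$ yields the
Euler-Lagrange field equation

$$\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot\phi}\right) = 0. \tag{5.96}$$

The generalization to three dimensions is

$$\frac{\partial\mathcal{L}}{\partial\phi} - \nabla\cdot\left(\frac{\partial\mathcal{L}}{\partial(\nabla\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot\phi}\right) = 0. \tag{5.97}$$

As an example, consider

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \tag{5.98}$$

where the factors $\rho$ (mass density) and $c$ (a velocity) have been introduced to
get the dimension of $\mathcal{L}$ right. Inserting this into the Euler-Lagrange field
equation (5.96), we obtain

$$\frac{\partial^2\phi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0 \tag{5.99}$$

which is precisely the wave equation (5.30) for the one-dimensional string,
now obtained via the Euler-Lagrange field equations. Note that the Lagrange
density $\mathcal{L}$ has the expected form (cf (5.48)) of ‘kinetic energy density minus
potential energy density’.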
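The step from the string Lagrangian density to the wave equation can be spelled out term by term. This is a short worked check, assuming $\rho$ and $c$ are constants (as the text implies) and using the one-dimensional Euler-Lagrange equation (5.96):

```latex
\begin{align*}
\frac{\partial\mathcal{L}_\rho}{\partial\phi} &= 0, \\
\frac{\partial\mathcal{L}_\rho}{\partial(\partial\phi/\partial x)} &= -\rho c^2\,\frac{\partial\phi}{\partial x}, \\
\frac{\partial\mathcal{L}_\rho}{\partial\dot\phi} &= \rho\,\frac{\partial\phi}{\partial t}, \\
\text{so that (5.96) becomes}\quad
0 + \rho c^2\,\frac{\partial^2\phi}{\partial x^2} - \rho\,\frac{\partial^2\phi}{\partial t^2} &= 0 .
\end{align*}
```

Dividing through by $\rho c^2$ gives the wave equation (5.99).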

For the final step, the passage to quantum mechanics for a field system,
we shall be interested in the Hamiltonian (total energy) of the system,
just as we were for the discrete case. Though we shall not actually use the
Hamiltonian in the classical field case, we shall introduce it here, generalizing
it to the quantum theory in the following section. We recall that Hamiltonian
mechanics is formulated in terms of coordinate variables (‘$q$’) and momentum
variables (‘$p$’), rather than the $q$ and $\dot{q}$ of Lagrangian mechanics. In the
continuum (field) case, the Hamiltonian $H$ is written as the integral of a
density $\mathcal{H}$ (we remain in one dimension)

$$H = \int dx\, \mathcal{H} \tag{5.100}$$

while the coordinates $q_r(t)$ become the ‘coordinate field’ $\phi(x,t)$. The question
is: what is the corresponding ‘momentum field’?

The answer to this is provided by a continuum version of the generalized
momentum derived from the Lagrangian approach (cf equation (5.44))

$$p = \partial L/\partial\dot{q}. \tag{5.101}$$
We define a ‘momentum field’ $\pi(x,t)$, technically called the ‘momentum
canonically conjugate to $\phi$’, by

$$\pi(x,t) = \partial\mathcal{L}/\partial\dot\phi(x,t) \tag{5.102}$$

where $\mathcal{L}$ is now the Lagrangian density. Note that $\pi$ has dimensions of a
momentum density. In the classical particle mechanics case we define the
Hamiltonian by

$$H(p, q) = p\dot{q} - L. \tag{5.103}$$

Here we define a Hamiltonian density $\mathcal{H}$ by

$$\mathcal{H}(\phi, \pi) = \pi(x,t)\dot\phi(x,t) - \mathcal{L}. \tag{5.104}$$
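Let us see how all this works for the one-dimensional string with $\mathcal{L}$ given
by

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2. \tag{5.105}$$

We have

$$\pi(x,t) = \rho\,\partial\phi/\partial t \tag{5.106}$$

and

$$\mathcal{H}_\rho = \frac{1}{\rho}\pi^2 - \mathcal{L}_\rho = \frac{1}{2}\left[\frac{1}{\rho}\pi^2 + \rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right] \tag{5.107}$$

so that

$$H_\rho = \int \left[\frac{1}{2\rho}\pi^2 + \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right] dx. \tag{5.108}$$

This has exactly the form we expect (see (5.35)), thus verifying the plausibility
of the above prescription.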

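Inserting the mode expansion (5.34) into (5.92) and (5.105) we obtain the
result (just as in (5.36) and problem 5.1)

$$L_\rho = \int_0^\ell dx\, \mathcal{L}_\rho = \frac{\ell}{2}\sum_{r=1}^{\infty}\left[\tfrac{1}{2}\rho\dot{A}_r^2 - \tfrac{1}{2}\rho\omega_r^2 A_r^2\right] \tag{5.109}$$

confirming that the system is equivalent to an infinite number of oscillators.
The momentum canonically conjugate to $A_r$ is

$$p_r = \frac{\partial L_\rho}{\partial\dot{A}_r} = \frac{\ell}{2}\rho\dot{A}_r \tag{5.110}$$

and the Hamiltonian is

$$H_\rho = \sum_{r=1}^{\infty}\left[\frac{2}{\ell}\,\frac{1}{2\rho}\,p_r^2 + \frac{\ell}{2}\,\tfrac{1}{2}\rho\omega_r^2 A_r^2\right]. \tag{5.111}$$

We may cast (5.111) into nicer form by the change of variables

$$P_r = \left(\frac{2}{\ell\rho}\right)^{1/2} p_r, \qquad Q_r = \left(\frac{\ell\rho}{2}\right)^{1/2} A_r \tag{5.112}$$

in terms of which

$$H = \sum_{r=1}^{\infty}\left[\tfrac{1}{2}P_r^2 + \tfrac{1}{2}\omega_r^2 Q_r^2\right] \tag{5.113}$$

just as in (5.56), with $N \to \infty$.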
5.2.5 Heisenberg–Lagrange–Hamilton quantum field mechanics


Finally, we are ready to quantize the classical field formalism, and arrive at a
quantum field mechanics, at least for the scalar field $\phi(x,t)$. If we were
dealing with the case in which $\phi(x,t)$ represented the displacement of a
one-dimensional stretched string, quantization would be straightforward. We
would take the classical Hamiltonian (5.113) and promote the mode coordinates
$Q_r$ and their conjugate momenta $P_r$ to operators satisfying commutation
relations of the form (5.85). The rest of the analysis would be exactly as in
equations (5.86) to (5.89), except that the number of modes $N$ is infinite. But
in the case of the general scalar field, we do not want to impose the boundary
conditions $\phi(0,t) = \phi(\ell,t) = 0$, which led to the mode expansion (5.34). It is
then not so clear how to proceed.

Fortunately, the Lagrange-Hamilton field formalism does indicate the way
forward, which is one good reason for developing it in the first place. (Another
is that it is very well suited to the analysis of symmetries, a crucial aspect
of gauge theories; see chapter 7.) In the previous section we introduced the
‘coordinate-like’ field $\phi(x,t)$ and (via the Lagrangian) the ‘momentum-like’
field $\pi(x,t)$. To pass to the quantized version of the field theory, we mimic
the procedure followed in the discrete case and promote both the quantities $\phi$
and $\pi$ to operators $\hat\phi$ and $\hat\pi$, in the Heisenberg picture. As usual, the distinctive
feature of quantum theory is the non-commutativity of certain basic quantities
in the theory, for example the fundamental commutator ($\hbar = 1$)

$$[\hat{q}_r(t), \hat{p}_s(t)] = i\delta_{rs} \tag{5.114}$$

of the discrete case. Thus we expect that the operators $\hat\phi$ and $\hat\pi$ will obey
some commutation relation which is a continuum generalization of (5.114).
The commutator will be of the form $[\hat\phi(x,t), \hat\pi(y,t)]$, since (recalling
figure 5.5) the discrete index $r$ or $s$ becomes the continuous variable $x$ or $y$; we
also note that (5.114) is between operators at equal times. The continuum
generalization of the $\delta_{rs}$ symbol is the Dirac $\delta$ function, $\delta(x - y)$, with the
properties

$$\int_{-\infty}^{\infty}\delta(x)\, dx = 1 \tag{5.115}$$

$$\int_{-\infty}^{\infty}\delta(x - y) f(x)\, dx = f(y) \tag{5.116}$$

for all reasonable functions $f$ (see appendix E). Thus the fundamental
commutator of quantum field theory is taken to be

$$[\hat\phi(x,t), \hat\pi(y,t)] = i\delta(x - y) \tag{5.117}$$

in the one-dimensional case, with obvious generalization to the three-dimensional
case via the symbol $\delta^3(\boldsymbol{x} - \boldsymbol{y})$. Remembering that we have set $\hbar = 1$,
it is straightforward to check that the dimensions are consistent on both
sides. Variables $\hat\phi$ and $\hat\pi$ obeying such a commutation relation are said to
be ‘conjugate’ to each other.
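The sifting property (5.116) can be illustrated numerically by replacing the $\delta$ function with a narrow normalized Gaussian. This is only a sketch under stated assumptions: the Gaussian regularization, the function names `narrow_gaussian` and `smear`, and the chosen width and grid are not from the original text.

```python
import math

def narrow_gaussian(x, width):
    """Normalized Gaussian of the given width, used as a stand-in for delta(x)."""
    return math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def smear(f, y, width, half_range=1.0, steps=20001):
    """Trapezoidal estimate of the integral of delta_width(x - y) * f(x) over x."""
    lower, upper = y - half_range, y + half_range
    h = (upper - lower) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lower + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0  # trapezoid end-point weights
        total += weight * narrow_gaussian(x - y, width) * f(x)
    return total * h

norm = smear(lambda x: 1.0, 0.3, 0.01)  # analogue of (5.115): should be close to 1
value = smear(math.cos, 0.3, 0.01)      # analogue of (5.116): should be close to cos(0.3)
```

As the width is reduced, `value` approaches $f(y)$ for any smooth $f$, which is the continuum counterpart of $\sum_r \delta_{rs} f_r = f_s$.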

What about the commutator of two $\hat\phi$’s or two $\hat\pi$’s? In the discrete case,
two different $\hat{q}$’s (in the Heisenberg picture) will commute at equal times,
$[\hat{q}_r(t), \hat{q}_s(t)] = 0$, and so will two different $\hat{p}$’s. We therefore expect to
supplement (5.117) with

$$[\hat\phi(x,t), \hat\phi(y,t)] = [\hat\pi(x,t), \hat\pi(y,t)] = 0. \tag{5.118}$$


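Let us now proceed to explore the effect of these fundamental commutator
assumptions, for the case of the Lagrangian density which yielded the wave
equation via the Euler-Lagrange equations, namely

$$\hat{\mathcal{L}}_\rho = \frac{1}{2}\rho\left(\frac{\partial\hat\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\hat\phi}{\partial x}\right)^2. \tag{5.119}$$

If we remove $\rho$, and set $c = 1$, we obtain

$$\hat{\mathcal{L}} = \frac{1}{2}\left(\frac{\partial\hat\phi}{\partial t}\right)^2 - \frac{1}{2}\left(\frac{\partial\hat\phi}{\partial x}\right)^2 \tag{5.120}$$

for which the Euler-Lagrange equation yields the field equation

$$\frac{\partial^2\hat\phi}{\partial t^2} - \frac{\partial^2\hat\phi}{\partial x^2} = 0. \tag{5.121}$$

We can think of (5.121) as a highly simplified (spin-0, one-dimensional)
version of the wave equation satisfied by the electromagnetic potentials. We
may guess, then, that the associated quanta are massless, as we shall soon
confirm.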
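The Lagrangian density (5.120) is our prototype quantum field Lagrangian
(one often slips into leaving out the word ‘density’). Applying the quantized
version of (5.95) we then have

$$\hat\pi(x,t) = \frac{\partial\hat{\mathcal{L}}}{\partial\dot{\hat\phi}(x,t)} = \dot{\hat\phi}(x,t) \tag{5.122}$$

and the Hamiltonian density is

$$\hat{\mathcal{H}} = \hat\pi\dot{\hat\phi} - \hat{\mathcal{L}} = \frac{1}{2}\hat\pi^2 + \frac{1}{2}\left(\frac{\partial\hat\phi}{\partial x}\right)^2. \tag{5.123}$$

The total Hamiltonian is

$$\hat{H} = \int \hat{\mathcal{H}}\, dx = \frac{1}{2}\int \left[\hat\pi^2 + \left(\frac{\partial\hat\phi}{\partial x}\right)^2\right] dx. \tag{5.124}$$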


It is not immediately clear how to find the eigenvalues and eigenstates of
the operator $\hat{H}$. However, it is exactly at this point that all our preliminary
work on normal modes comes into its own. If we can write the Hamiltonian as
some kind of sum over independent oscillators (i.e. modes) we shall know how
to proceed. For the classical string with fixed end points which was considered
in section 5.1, the mode expansion was simply a Fourier expansion. In the
present case, we want to allow the field to extend throughout all of space,
without the periodicity imposed by fixed-end boundary conditions. In that
case, the Fourier series is replaced by a Fourier integral, and standing waves
are replaced by travelling waves. For the classical field obeying the wave
equation (5.30) there are plane-wave solutions

$$\phi(x,t) \propto e^{ikx - i\omega t} \tag{5.125}$$

where ($c = 1$)

$$\omega = k \tag{5.126}$$

which is just the dispersion relation of light in vacuo. The general field may
be Fourier expanded in terms of these solutions:

$$\phi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}\left[a(k)e^{ikx - i\omega t} + a^*(k)e^{-ikx + i\omega t}\right] \tag{5.127}$$

where we have required $\phi$ to be real. (The rather fussy factors $(2\pi\sqrt{2\omega})^{-1}$
are purely conventional, and determine the normalization of the expansion
coefficients $a$, $a^*$ and $\hat{a}$, $\hat{a}^\dagger$ later; in turn, the latter enter into the definition,
and normalization, of the states; see (5.143).) Similarly, the ‘momentum
field’ $\pi = \dot\phi$ is expanded as

$$\pi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}(-i\omega)\left[a(k)e^{ikx - i\omega t} - a^*(k)e^{-ikx + i\omega t}\right]. \tag{5.128}$$
We quantize these mode expressions by promoting $\phi \to \hat\phi$, $\pi \to \hat\pi$ and
assuming the commutator (5.117). Thus we write

$$\hat\phi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right] \tag{5.129}$$

and similarly for $\hat\pi$. The commutator (5.117) now determines the commutators
of the mode operators $\hat{a}$ and $\hat{a}^\dagger$:

$$[\hat{a}(k), \hat{a}^\dagger(k')] = 2\pi\delta(k - k'), \qquad [\hat{a}(k), \hat{a}(k')] = [\hat{a}^\dagger(k), \hat{a}^\dagger(k')] = 0 \tag{5.130}$$

as shown in problem 5.6. These are the desired continuum analogues of the
discrete oscillator commutation relations

$$[\hat{a}_r, \hat{a}_s^\dagger] = \delta_{rs}, \qquad [\hat{a}_r, \hat{a}_s] = [\hat{a}_r^\dagger, \hat{a}_s^\dagger] = 0. \tag{5.131}$$

The precise factor in front of the $\delta$-function in (5.130) depends on the
normalization choice made in the expansion of $\hat\phi$, (5.129). Problem 5.6 also shows
that the commutation relations (5.130) lead to (5.118) as expected.

The form of the $\hat{a}$, $\hat{a}^\dagger$ commutation relations (5.130) already suggests that
the $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ operators are precisely the single-quantum destruction and
creation operators for the continuum problem. To verify this interpretation
and find the eigenvalues of $\hat{H}$, we now insert the expansion for $\hat\phi$ and $\hat\pi$ into
$\hat{H}$ of (5.124). One finds the remarkable result (problem 5.7)

$$\hat{H} = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\left\{\tfrac{1}{2}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right]\omega\right\}. \tag{5.132}$$

Comparing this with the single-oscillator result

$$\hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega \tag{5.133}$$

shows that, as anticipated in section 5.1, each classical mode of the field can
be quantized, and behaves like a separate oscillator coordinate, with its own
frequency $\omega = k$. The operator $\hat{a}^\dagger(k)$ creates, and $\hat{a}(k)$ destroys, a quantum
of the $k$ mode. The factor $(2\pi)^{-1}$ in $\hat{H}$ arises from our normalization choice.

We note that in the field operator $\hat\phi$ of (5.129), those terms which destroy
quanta go with the factor $e^{-i\omega t}$, while those which create quanta go with
$e^{+i\omega t}$. This choice is deliberate and is consistent with the ‘absorption’ and
‘emission’ factors $e^{\pm i\omega t}$ of ordinary time-dependent perturbation theory in
quantum mechanics (cf equation (A.33) of appendix A).

What is the mass of these quanta? We know that their frequency $\omega$ is
related to their wavenumber $k$ by (5.126), which (restoring $\hbar$’s and $c$’s) can
be regarded as equivalent to $\hbar\omega = \hbar ck$, or $E = cp$, where we use the Einstein
and de Broglie relations. This is precisely the $E$–$p$ relation appropriate to a
massless particle, as expected.

What is the energy spectrum? We expect the ground state to be determined
by the continuum analogue of

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r; \tag{5.134}$$

namely

$$\hat{a}(k)|0\rangle = 0 \quad \text{for all } k. \tag{5.135}$$

However, there is a problem with this. If we allow the Hamiltonian of (5.132)
to act on $|0\rangle$ the result is not (as we would expect) zero, because of the
$\hat{a}(k)\hat{a}^\dagger(k)$ term (the other term does give zero by (5.135)). In the single
oscillator case, we rewrote $\hat{a}\hat{a}^\dagger$ in terms of $\hat{a}^\dagger\hat{a}$ by using the commutation
relation (5.72), and this led to the ‘zero-point energy’, $\tfrac{1}{2}\omega$, of the oscillator
ground state. Adopting the same strategy here, we write $\hat{H}$ of (5.132) as

$$\hat{H} = \int\frac{dk}{2\pi}\,\hat{a}^\dagger(k)\hat{a}(k)\,\omega + \int\frac{dk}{2\pi}\,\tfrac{1}{2}[\hat{a}(k), \hat{a}^\dagger(k)]\,\omega. \tag{5.136}$$

Now consider $\hat{H}|0\rangle$: we see from the definition of the vacuum (5.135) that the
first term will give zero as expected, but the second term is infinite, since the
commutation relation (5.130) produces the infinite quantity ‘$\delta(0)$’ as $k \to k'$;
moreover, the $k$ integral diverges.

This term is obviously the continuum analogue of the zero-point energy $\tfrac{1}{2}\omega$,
but because there are infinitely many oscillators, it is infinite. The conventional
ploy is to argue that only energy differences, relative to a conveniently
defined ground state, really matter, so that we may discard the infinite
constant in (5.136). Then the ground state $|0\rangle$ has energy zero, by definition, and
the eigenvalues of $\hat{H}$ are of the form

$$\int\frac{dk}{2\pi}\, n(k)\,\omega \tag{5.137}$$

where $n(k)$ is the number of quanta (counted by the number operator $\hat{a}^\dagger(k)\hat{a}(k)$)
of energy $\omega = k$. For each definite $k$, and hence $\omega$, the spectrum is like that of
the simple harmonic oscillator. The process of going from (5.132) to (5.136)
without the second term is called ‘normally ordering’ the $\hat{a}$ and $\hat{a}^\dagger$ operators:
in a ‘normally ordered’ expression, all $\hat{a}^\dagger$’s are to the left of all $\hat{a}$’s, with the
result that the vacuum value of such expressions is by definition zero.
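The divergence of the zero-point term, and the finiteness of energies measured relative to the vacuum, can be seen in a simple discretized model. The sketch below is an illustration only and is not the continuum calculation of the text: it assumes the fixed-end string of section 5.1 with mode frequencies $\omega_r = r\pi/\ell$ (taking $c = 1$), a hard cutoff on the number of modes, and hypothetical function names.

```python
import math

def zero_point_energy(n_modes, length=1.0):
    """Sum of omega_r / 2 over the first n_modes modes, with omega_r = r * pi / length."""
    return sum(0.5 * math.pi * r / length for r in range(1, n_modes + 1))

def excitation_energy(occupations, length=1.0):
    """Normally ordered energy: sum of n_r * omega_r, with the vacuum at zero."""
    return sum(n * math.pi * r / length for r, n in enumerate(occupations, start=1))

low_cutoff = zero_point_energy(10)    # grows like n_modes**2 as the cutoff is raised
high_cutoff = zero_point_energy(100)
two_mode_state = excitation_energy([1, 0, 2])  # one quantum in mode 1, two in mode 3
```

Raising the cutoff makes the zero-point sum grow without bound, whereas `excitation_energy` does not depend on the cutoff at all, which is the practical content of normal ordering.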

It has to be admitted that the argument that only energy differences matter 
is false as far as gravity is concerned, which couples to all sources of energy. 
It would ultimately be desirable to have theories in which the vacuum energy 
came out finite from the start (as actually happens in ‘supersymmetric’ field 
theories — see for example Weinberg (1995), p 325); see also comment (3). 

We proceed on to the excited states. Any desired state in which excitation
quanta are present can be formed by the appropriate application of $\hat{a}^\dagger(k)$
operators to the ground state $|0\rangle$. For example, a two-quantum state containing
one quantum of momentum $k_1$ and another of momentum $k_2$ may be written
(cf (5.81))

$$|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle. \tag{5.138}$$

A general state will contain an arbitrary number of quanta.

Once again, and this time more formally, we have completed the programme
outlined in section 5.1, ending up with the ‘quantization’ of a classical
field $\phi(x,t)$, as exemplified in the basic expression (5.129), together with the
interpretation of the operators $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ as destruction and creation
operators for mode quanta. We have, at least implicitly, still retained up to this
point the ‘mechanical model’ of some material object oscillating: some kind
of infinitely extended ‘jelly’. We now throw away the mechanical props and
embrace the unadorned quantum field theory! We do not ask what is waving,
we simply postulate a field (such as $\phi$) and quantize it. Its quanta of
excitation are what we call particles, for example photons in the electromagnetic
case.

We end this long section with some further remarks about the formalism,
and the physical interpretation of our quantum field $\hat\phi$.


Comment (1) 


The alert reader, who has studied appendix I, may be worried about the
following (possible) consistency problem. The fields $\hat\phi$ and $\hat\pi$ are Heisenberg
picture operators, and obey the equations of motion

$$\dot{\hat\phi}(x,t) = -i[\hat\phi(x,t), \hat{H}] \tag{5.139}$$

$$\dot{\hat\pi}(x,t) = -i[\hat\pi(x,t), \hat{H}] \tag{5.140}$$

where $\hat{H}$ is given by (5.132). It is a good exercise to check (problem 5.8(a))
that (5.139) yields just the expected relation $\dot{\hat\phi}(x,t) = \hat\pi(x,t)$ (cf (5.122)).
Thus (5.140) becomes

$$\ddot{\hat\phi}(x,t) = -i[\hat\pi(x,t), \hat{H}]. \tag{5.141}$$

However, we have assumed in our work here that $\hat\phi$ obeyed the wave equation
(cf (5.121))

$$\ddot{\hat\phi} = \frac{\partial^2}{\partial x^2}\hat\phi(x,t) \tag{5.142}$$

as a consequence of the quantized version of the Euler-Lagrange equation (5.96).
Thus the right-hand sides of (5.141) and (5.142) need to be the same, for
consistency, and they are: see problem 5.8(b). Thus, at least in this case,
the Heisenberg operator equations of motion are consistent with the
Euler-Lagrange equations.


Comment (2) 


Following on from this, we may note that this formalism encompasses both
the wave and the particle aspects of matter and radiation. The former is
evident from the plane-wave expansion functions in the expansion of $\hat\phi$, (5.129),
which in turn originate from the fact that $\hat\phi$ obeys the wave equation (5.121).
The latter follows from the discrete nature of the energy spectrum and the
associated operators $\hat{a}$, $\hat{a}^\dagger$ which refer to individual quanta, i.e. particles.


Comment (3) 


Next, we may ask: what is the meaning of the ground state $|0\rangle$ for a quantum
field? It is undoubtedly the state with $n(k) = 0$ for all $k$, i.e. the state with
no quanta in it, and hence no particles in it, on our new interpretation. It is
therefore the vacuum! As we shall see later, this understanding of the vacuum
as the ground state of a field system is fundamental to much of modern particle
physics, for example to quark confinement and to the generation of mass for
the weak vector bosons. Note that although we discarded the overall (infinite)
constant in $\hat{H}$, differences in zero-point energies can be detected; for example,
in the Casimir effect (Casimir 1948, Kitchener and Prosser 1957, Sparnaay
1958, Lamoreaux 1997, 1998). These and other aspects of the quantum field
theory vacuum are discussed in Aitchison (1985).


Comment (4) 


Consider the two-particle state (5.138): $|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle$. Since the
$\hat{a}^\dagger$ operators commute, (5.130), this state is symmetric under the interchange
$k_1 \leftrightarrow k_2$. This is an inevitable feature of the formalism as so far developed:
there is no possible way of distinguishing one quantum of energy from another,
and we expect the two-quantum state to be indifferent to the order in which
the quanta are put in it. However, this has an important implication for
the particle interpretation: since the state is symmetric under interchange
of the particle labels $k_1$ and $k_2$, it must describe identical bosons. How the
formalism is modified in order to describe the antisymmetric states required
for two fermionic quanta will be discussed in section 7.2.
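The symmetry under $k_1 \leftrightarrow k_2$ can be made concrete with a toy occupation-number representation. The sketch below rests on assumptions that are not in the original text: the modes are labelled by discrete values with unit-normalized commutators $[\hat{a}_k, \hat{a}_{k'}^\dagger] = \delta_{kk'}$ (rather than the continuum $2\pi\delta(k - k')$), and the state representation and the function name `create` are hypothetical.

```python
import math

def create(state, k):
    """Apply a^dagger(k) to a state stored as {occupation: amplitude}.

    Each occupation is a sorted tuple of (mode label, number of quanta) pairs,
    so the order in which quanta were added leaves no trace in the key.
    """
    result = {}
    for occupation, amplitude in state.items():
        counts = dict(occupation)
        n = counts.get(k, 0)
        counts[k] = n + 1
        key = tuple(sorted(counts.items()))
        # a^dagger |n> = sqrt(n + 1) |n + 1> for the mode being excited
        result[key] = result.get(key, 0.0) + math.sqrt(n + 1) * amplitude
    return result

VACUUM = {(): 1.0}  # the state |0> with no quanta in any mode

state_12 = create(create(VACUUM, 2.0), 1.0)  # a^dagger(k1) a^dagger(k2) |0>, k1 = 1.0, k2 = 2.0
state_21 = create(create(VACUUM, 1.0), 2.0)  # a^dagger(k2) a^dagger(k1) |0>
double = create(create(VACUUM, 1.0), 1.0)    # two quanta placed in the same mode
```

The two orderings give the same state, and putting both quanta in one mode produces the factor $\sqrt{2}$ familiar from (5.81); nothing in the representation forbids double occupancy, as expected for bosons.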


Comment (5) 


Finally, the reader may well wonder how to connect the quantum field theory
formalism to ordinary ‘wavefunction’ quantum mechanics. The ability to see
this connection will be important in subsequent chapters and it is indeed quite
simple. Suppose we form a state containing one quantum of the $\hat\phi$ field, with
momentum $k'$:

$$|k'\rangle = N\hat{a}^\dagger(k')|0\rangle \tag{5.143}$$

where $N$ is a normalization constant. Now consider the amplitude $\langle 0|\hat\phi(x,t)|k'\rangle$.
We expand this out as

$$\langle 0|\hat\phi(x,t)|k'\rangle = \langle 0|\int\frac{dk}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right] N\hat{a}^\dagger(k')|0\rangle. \tag{5.144}$$

The ‘$\hat{a}^\dagger\hat{a}^\dagger$’ term will give zero since $\langle 0|\hat{a}^\dagger = 0$. For the other term we use the
commutation relation (5.130) to write it as

$$\langle 0|\int\frac{dk}{2\pi\sqrt{2\omega}}\left[\hat{a}^\dagger(k')\hat{a}(k) + 2\pi\delta(k - k')\right]e^{ikx - i\omega t}\,N|0\rangle = N\frac{e^{ik'x - i\omega' t}}{\sqrt{2\omega'}} \tag{5.145}$$

using the vacuum condition once again, and integrating over the $\delta$ function
using the property (5.116) which sets $k = k'$ and hence $\omega = \omega'$. The vacuum
is normalized to unity, $\langle 0|0\rangle = 1$. The normalization constant $N$ can be
adjusted according to the desired convention for the normalization of the
states and wavefunctions. The result is just the plane-wave wavefunction for
a particle in the state $|k'\rangle$! Thus we discover that the vacuum to one-particle
matrix elements of the field operators are just the familiar wavefunctions of
single-particle quantum mechanics. In this connection we can explain some
common terminology. The path to quantum field theory that we have followed
is sometimes called ‘second quantization’, ordinary single-particle quantum
mechanics being the first-quantized version of the theory.

5.3 Generalizations: four dimensions, relativity and mass 


In the previous section we have shown how quantum mechanics may be married
to field theory, but we have considered only one spatial dimension, for
simplicity. Now we must generalize to three and incorporate the demands of
relativity. This is very easy to do in the Lagrangian approach, for the scalar
field $\hat\phi(\boldsymbol{x},t)$. ‘Scalar’ means that the field has only one independent
component at each point $(\boldsymbol{x},t)$, unlike the electromagnetic field, for instance,
for which the analogous quantity has four components, making up a 4-vector
field $A^\mu(\boldsymbol{x},t) = (A_0(\boldsymbol{x},t), \boldsymbol{A}(\boldsymbol{x},t))$ (see chapter 7). In the quantum case, a
one-component field (or wavefunction) is appropriate for spin-0 particles.

As we saw in (5.97), the three-dimensional Euler-Lagrange equations are

$$\frac{\partial\mathcal{L}}{\partial\phi} - \nabla\cdot\left(\frac{\partial\mathcal{L}}{\partial(\nabla\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot\phi}\right) = 0 \tag{5.146}$$

which may immediately be rewritten in relativistically invariant form

$$\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = 0 \tag{5.147}$$

where $\partial_\mu = \partial/\partial x^\mu$. Similarly, the action

$$S = \int dt\int d^3\boldsymbol{x}\,\mathcal{L} = \int d^4x\,\mathcal{L} \tag{5.148}$$
will be relativistically invariant if $\mathcal{L}$ is, since the volume element $d^4x$ is
invariant. Thus, to construct a relativistic field theory, we have to construct
an invariant density $\mathcal{L}$ and use the already given covariant Euler-Lagrange
equation. Thus our previous string Lagrangian

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \tag{5.149}$$

with $\rho = c = 1$ generalizes to

$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi \tag{5.150}$$

and produces the invariant wave equation

$$\partial_\mu\partial^\mu\phi \equiv \Box\phi = 0. \tag{5.151}$$

All of this goes through just the same when the fields are quantized.

This invariant Lagrangian describes a field whose quanta are massless.
To find the Lagrangian for the case of massive quanta, we need to find the
Lagrangian that gives us the Klein-Gordon equation (see section 3.1)

$$(\Box + m^2)\hat\phi(\boldsymbol{x},t) = 0 \tag{5.152}$$

via the Euler-Lagrange equations.
The answer is a simple generalization of (5.150):

$$\hat{\mathcal{L}}_{\rm KG} = \tfrac{1}{2}\partial_\mu\hat\phi\,\partial^\mu\hat\phi - \tfrac{1}{2}m^2\hat\phi^2. \tag{5.153}$$

The plane-wave solutions of the field equation (now the KG equation) have
frequencies (or energies) given by

$$\omega^2 = \boldsymbol{k}^2 + m^2 \tag{5.154}$$

which is the correct energy-momentum relation for a massive particle.
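Both statements can be verified in a few lines. The following worked check uses the covariant Euler-Lagrange equation (5.147) and assumes the metric convention in which $k\cdot x = \omega t - \boldsymbol{k}\cdot\boldsymbol{x}$ and $\Box \equiv \partial_\mu\partial^\mu = \partial^2/\partial t^2 - \nabla^2$:

```latex
\begin{align*}
\frac{\partial\mathcal{L}_{\rm KG}}{\partial\phi} &= -m^2\phi, \qquad
\frac{\partial\mathcal{L}_{\rm KG}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi, \\
\text{so (5.147) gives}\quad -m^2\phi - \partial_\mu\partial^\mu\phi &= 0
\quad\Longleftrightarrow\quad (\Box + m^2)\phi = 0 . \\
\text{For } \phi \propto e^{-ik\cdot x}:\quad
\Box\, e^{-ik\cdot x} &= -(\omega^2 - \boldsymbol{k}^2)\, e^{-ik\cdot x}
\quad\Longrightarrow\quad \omega^2 = \boldsymbol{k}^2 + m^2 .
\end{align*}
```

Setting $m = 0$ recovers the massless relation $\omega = |\boldsymbol{k}|$ of the previous section.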


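How do we quantize this field theory? The four-dimensional analogue of
the Fourier expansion of the field $\hat\phi$ takes the form

$$\hat\phi(\boldsymbol{x},t) = \int_{-\infty}^{\infty}\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{-ik\cdot x} + \hat{a}^\dagger(k)e^{ik\cdot x}\right] \tag{5.155}$$

with a similar expansion for the ‘conjugate momentum’ $\hat\pi = \dot{\hat\phi}$:

$$\hat\pi(\boldsymbol{x},t) = \int_{-\infty}^{\infty}\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}(-i\omega)\left[\hat{a}(k)e^{-ik\cdot x} - \hat{a}^\dagger(k)e^{ik\cdot x}\right]. \tag{5.156}$$

Here $k\cdot x$ is the four-dimensional dot product $k\cdot x = \omega t - \boldsymbol{k}\cdot\boldsymbol{x}$, and $\omega =
+(\boldsymbol{k}^2 + m^2)^{1/2}$. The Hamiltonian is found to be

$$\hat{H}_{\rm KG} = \int d^3\boldsymbol{x}\,\hat{\mathcal{H}}_{\rm KG} = \frac{1}{2}\int d^3\boldsymbol{x}\left[\hat\pi^2 + \nabla\hat\phi\cdot\nabla\hat\phi + m^2\hat\phi^2\right] \tag{5.157}$$

and this can be expressed in terms of the $\hat{a}$’s and the $\hat{a}^\dagger$’s using the expansion
for $\hat\phi$ and $\hat\pi$ and the commutator

$$[\hat{a}(k), \hat{a}^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \tag{5.158}$$

with all others vanishing. The result is, as expected,

$$\hat{H}_{\rm KG} = \frac{1}{2}\int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right]\omega \tag{5.159}$$

and, normally ordering as usual, we arrive at

$$\hat{H}_{\rm KG} = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\,\hat{a}^\dagger(k)\hat{a}(k)\,\omega. \tag{5.160}$$

This supports the physical interpretation of the mode operators $\hat{a}^\dagger$ and $\hat{a}$ as
creation and destruction operators for quanta of the field $\hat\phi$ as before, except
that now the energy-momentum relation for these particles is the relativistic
one, for particles of mass $m$.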

Since $\hat\phi$ is real ($\hat\phi = \hat\phi^\dagger$) and has no spin degrees of freedom, it is called
a real scalar field. Only field quanta of one type enter: those created by
$\hat{a}^\dagger$ and destroyed by $\hat{a}$. Thus $\hat\phi$ would correspond physically to a case where
there was a unique particle state of a given mass $m$, for example the $\pi^0$ field.
Actually, of course, we would not want to describe the $\pi^0$ in any fundamental
sense in terms of such a field, since we know it is not a point-like object ($\hat\phi$
is defined only at the single space-time point $(\boldsymbol{x},t)$). The question of whether
true ‘elementary’ scalar fields exist in nature is an interesting one: in the
Standard Model, as we shall eventually see in volume 2, the Higgs field is a
scalar field (though it contains several components with different charge). It
remains to be seen if this field (and the associated quantum, the Higgs boson)
is a scalar, and if so whether it is elementary or composite.

We have learned how to describe free relativistic spinless particles of finite 
mass as the quanta of a relativistic quantum field. We now need to understand 
interactions in quantum field theory. 


Problems 
5.1 Verify equation (5.36). 


5.2 Consider one-dimensional motion under gravity so that $V(x) = -mgx$ in
(5.39). Evaluate $S$ of (5.38) for $t_1 = 0$, $t_2 = t_0$, for three possible trajectories:

(a) a
(b) $x(t) = \tfrac{1}{2}gt^2$ (the Newtonian result) and
(c)

where the constants $a$ and $b$ are to be chosen so that all the trajectories end
at the same point $x(t_0)$.


5.3
(a) Use (5.57) and (5.63) to verify that

$$\hat{p} = m\dot{\hat{q}}$$

is consistent with the Heisenberg equation of motion for $\hat{A} = \hat{q}$.

(b) By similar methods verify that

$$\dot{\hat{p}} = -m\omega^2\hat{q}.$$


5.4
(a) Rewrite the Hamiltonian $\hat{H}$ of (5.63) in terms of the operators $\hat{a}$
and $\hat{a}^\dagger$.

(b) Evaluate the commutator between $\hat{a}$ and $\hat{a}^\dagger$ and use this result
together with your expression for $\hat{H}$ from part (a) to verify
equation (5.73).

(c) Verify that for $|n\rangle$ given by equation (5.81) the normalization
condition

$$\langle n|n\rangle = 1$$

is satisfied.

(d) Verify (5.83) directly using the commutation relation (5.72).


5.5 Treating $\psi$ and $\psi^*$ as independent classical fields, show that the
Lagrangian density

$$\mathcal{L} = i\psi^*\dot\psi - (1/2m)\nabla\psi^*\cdot\nabla\psi$$

gives the Schrödinger equation for $\psi$ and $\psi^*$ correctly.


5.6
(a) Verify that the commutation relations for $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ (equations
(5.130)) are consistent with the equal time commutation relation
between $\hat\phi$ and $\hat\pi$ (equation (5.117)), and with (5.118).

(b) Consider the unequal time commutator $D(x_1, x_2) = [\hat\phi(\boldsymbol{x}_1,t_1),
\hat\phi(\boldsymbol{x}_2,t_2)]$, where $\hat\phi$ is a massive KG field in three dimensions. Show
that

$$D(x_1, x_2) = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3 2E}\left[e^{-ik\cdot(x_1 - x_2)} - e^{ik\cdot(x_1 - x_2)}\right] \tag{5.161}$$

where $k\cdot(x_1 - x_2) = E(t_1 - t_2) - \boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)$, and $E = (\boldsymbol{k}^2 +
m^2)^{1/2}$. Note that $D$ is not an operator, and that it depends only
on the difference of coordinates $x_1 - x_2$, consistent with translation
invariance. Show that $D(x_1, x_2)$ vanishes for $t_1 = t_2$. Explain why
the right-hand side of (5.161) is Lorentz invariant (see the exercise
in appendix E), and use this fact to show that $D(x_1, x_2)$ vanishes
for all space-like separations $(x_1 - x_2)^2 < 0$. Discuss the significance
of this result, or see the discussion in section 6.3.2!


5.7 Insert the plane-wave expansions for the operators $\hat\phi$ and $\hat\pi$ into the
equation for $\hat{H}$, (5.124), and verify equation (5.132). [Hint: note that $\omega$ is defined
to be always positive, so that (5.126) should strictly be written $\omega = |k|$.]


5.8
(a) Use (5.117) and (5.124) to verify that $\hat\pi(x,t) = \dot{\hat\phi}(x,t)$ is consistent
with the Heisenberg equation of motion for $\hat\phi(x,t)$. [Hint: write the
integral in (5.124) as over $y$, not $x$!]

(b) Similarly, verify the consistency of (5.141) and (5.121).


6 


Quantum Field Theory II: Interacting Scalar Fields


6.1 Interactions in quantum field theory: qualitative introduction


In the previous chapter we considered only free — i.e. non-interacting — quan- 
tum fields. The fact that they are non-interacting is evident in a number of 
ways. The mode expansions (5.129) and (5.155) are written in terms of the 
(free) plane-wave solutions of the associated wave equations. Also the Hamil- 
tonians turned out to be just the sum of individual oscillator Hamiltonians 
for each mode frequency, as in (5.132) or (5.159). The energies of the quanta 
add up — they are non-interacting quanta. Finally, since the Hamiltonians are 
just sums of number operators 


$$\hat{n}(k) = \hat{a}^\dagger(k)\hat{a}(k) \tag{6.1}$$


it is obvious that each such operator commutes with the Hamiltonian and is 
therefore a constant of the motion. Thus two waves, each with one excitation 
quantum, travelling towards each other will pass smoothly through each other 
and emerge unscathed on the other side — they will not interact at all. 

How can we get the mode quanta to interact? If we return to our dis- 
cussion of classical mechanical systems in section 5.1, we see that the crucial 
step in arriving at the ‘sum over oscillators’ form for the energy was the as- 
sumption that the potential energy was quadratic in the small displacements 
$q_r$. We expect that ‘modes will interact’ when we go beyond this harmonic
approximation. The same is true in the continuous (wave or field) case. In the
derivation of the appropriate wave equation you will find that somewhere an
approximation like $\tan\phi \approx \phi$ or $\sin\phi \approx \phi$ is made. This linearizes the
equation, and solutions to linear equations can be linearly superposed to make new
solutions. If we retain higher powers of $\phi$, such as $\phi^3$, the resulting nonlinear
equation has solutions that cannot be obtained by superposing two indepen- 
dent solutions. Thus two waves travelling towards each other will not just 
pass smoothly through each other: various forms of interaction and distortion 
of the original waveforms will occur. 

What happens when we quantize such anharmonic systems? To gain some 
idea of the new features that emerge, consider just one ‘anharmonic oscillator’ 
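with Hamiltonian

$$ \hat{H} = (1/2m)\hat{p}^2 + \tfrac{1}{2}m\omega^2\hat{q}^2 + \lambda\hat{q}^3. \tag{6.2} $$

In terms of the $\hat{a}$ and $\hat{a}^\dagger$ combinations this becomes

$$ \hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega + \lambda(\hat{a} + \hat{a}^\dagger)^3 \tag{6.3} $$

$$ \equiv \hat{H}_0 + \lambda\hat{H}' \tag{6.4} $$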


where $\hat{H}_0$ is our previous free oscillator Hamiltonian. The algebraic tricks we
used to find the spectrum of $\hat{H}_0$ do not work for this new $\hat{H}$ because of the
addition of the $\hat{H}'$ interaction term. In particular, although $\hat{H}_0$ commutes with
the number operator $\hat{a}^\dagger\hat{a}$, $\hat{H}'$ does not. Therefore, whatever the eigenstates of
$\hat{H}$ are, they will not in general have a definite number of ‘$\hat{H}_0$ quanta’. In fact,
we cannot find an exact algebraic solution to this new eigenvalue problem,
and we must resort to perturbation theory or to numerical methods.

The perturbative solution to this problem treats $\lambda\hat{H}'$ as a perturbation
and expands the true eigenstates of $\hat{H}$ in terms of the eigenstates of $\hat{H}_0$:

$$ |\bar{r}\rangle = \sum_n c_{rn}|n\rangle. \tag{6.5} $$


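From this expansion we see that, as expected, the true eigenstates $|\bar{r}\rangle$ will
‘contain different numbers of $\hat{H}_0$ quanta’: $|c_{rn}|^2$ is the probability of finding $n$
‘$\hat{H}_0$ quanta’ in the state $|\bar{r}\rangle$. Perturbation theory now proceeds by expanding
the coefficients $c_{rn}$ and exact energy eigenvalues $\bar{E}_r$ as power series in the
strength $\lambda$ of the perturbation. For example, the exact energy eigenvalue has
the expansion

$$ \bar{E}_r = E_r^{(0)} + \lambda E_r^{(1)} + \lambda^2 E_r^{(2)} + \cdots \tag{6.6} $$

where

$$ \hat{H}_0|r\rangle = E_r^{(0)}|r\rangle \tag{6.7} $$

and

$$ E_r^{(1)} = \langle r|\hat{H}'|r\rangle \tag{6.8} $$

$$ E_r^{(2)} = \sum_{s\neq r}\frac{\langle r|\hat{H}'|s\rangle\langle s|\hat{H}'|r\rangle}{E_r^{(0)} - E_s^{(0)}}. \tag{6.9} $$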


To evaluate the second-order shift in energy, we therefore need to consider 
matrix elements of the form 


$$ \langle s|(\hat{a} + \hat{a}^\dagger)^3|r\rangle. \tag{6.10} $$

Keeping careful track of the order of the $\hat{a}$ and $\hat{a}^\dagger$ operators, we can evaluate
these matrix elements and find, in this case, that there are non-zero matrix
elements for states $\langle s| = \langle r+3|$, $\langle r+1|$, $\langle r-1|$ and $\langle r-3|$.
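The selection rule just quoted, and the second-order shift (6.9), can be checked numerically. The following is only a minimal sketch: it takes $\hat{H}' = (\hat{a} + \hat{a}^\dagger)^3$ exactly as written in (6.3), uses $E_n^{(0)} = (n + \tfrac{1}{2})\omega$, and truncates the number basis at an arbitrarily chosen dimension (an assumption of the sketch, harmless for levels well below the cut-off). The function names are ours, introduced for illustration.

```python
import numpy as np

def ladder_operators(dim):
    """Return (a, a_dagger) truncated to the lowest `dim` oscillator levels."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # a|n> = sqrt(n)|n-1>
    return a, a.T

def cubic_perturbation(dim):
    """Matrix of H' = (a + a_dagger)^3 in the number basis."""
    a, a_dag = ladder_operators(dim)
    x = a + a_dag
    return x @ x @ x

def connected_offsets(r, dim=40, tol=1e-12):
    """Values of s - r for which the matrix element <s|H'|r> is non-zero."""
    h_prime = cubic_perturbation(dim)
    return [s - r for s in range(dim) if abs(h_prime[s, r]) > tol]

def second_order_shift(r, omega=1.0, dim=40):
    """E_r^(2) of (6.9), with unperturbed energies E_n^(0) = (n + 1/2) * omega."""
    h_prime = cubic_perturbation(dim)
    shift = 0.0
    for s in range(dim):
        if s != r and abs(h_prime[s, r]) > 1e-12:
            # E_r^(0) - E_s^(0) = (r - s) * omega
            shift += h_prime[r, s] * h_prime[s, r] / ((r - s) * omega)
    return shift

print(connected_offsets(5))      # offsets s - r that survive for r = 5
print(second_order_shift(0))     # second-order shift of the ground state
```

For the ground state only $s = 1$ and $s = 3$ contribute, with $|\langle 1|\hat{H}'|0\rangle|^2 = 9$ and $|\langle 3|\hat{H}'|0\rangle|^2 = 6$, so the sum in (6.9) is $-9/\omega - 6/3\omega = -11/\omega$.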
What about the quantum mechanics of two coupled nonlinear oscillators?
In the same way, the general state is assumed to be a superposition

$$ |\bar{r}\rangle = \sum_{n_1,n_2} c_{r,n_1n_2}|n_1\rangle|n_2\rangle \tag{6.11} $$

of states of arbitrary numbers of quanta of the unperturbed oscillator
Hamiltonians $\hat{H}_{0(1)}$ and $\hat{H}_{0(2)}$. States of the unperturbed system contain definite
numbers $n_1$ and $n_2$, say, of the ‘1’ and ‘2’ quanta. Perturbation calculations of
the interacting system will involve matrix elements connecting such $|n_1\rangle|n_2\rangle$
states to states $|n_1'\rangle|n_2'\rangle$ with different numbers of these quanta.

All this can be summarized by the remark that the typical feature of 
quantized interacting modes is that we need to consider processes in which 
the numbers of the different mode quanta are not constants of the motion. 
This is, of course, exactly what happens when we have collisions between 
high-energy particles. When far apart the particles, definite in number, are 
indeed free and are just the mode quanta of some quantized fields. But, when 
they interact, we must expect to see changes in the numbers of quanta, and 
can envisage processes in which the number of quanta which emerge finally 
as free particles is different from the number that originally collided. From 
the quantum mechanical examples we have discussed, we expect that these
interactions will be produced by terms like $\hat{\phi}^3$ or $\hat{\phi}^4$, since the free (‘harmonic’)
case has $\hat{\phi}^2$, analogous to $\hat{q}^2$ in the quantum mechanics example. Such
terms arise in the solid state phonon application precisely from anharmonic
corrections involving the atomic displacements. These terms lead to non-trivial
phonon-phonon scattering, the treatment of which forms the basis of
the quantum theory of thermal resistivity of insulators. In the quantum field
theory case, when we have generalized the formalism to fermions and photons,
the nonlinear interaction terms will produce $e^+e^-$ scattering, $q\bar{q}$ annihilation
and so on. As in the quantum mechanical case, the basic calculational method 
will be perturbation theory. 

As remarked earlier, the trouble with all these ‘real-life’ cases is that they 
involve significant complications due to spin; the corresponding fields then 
have several components, with attendant complexity in the solutions of the 
associated free-particle wave equations (Maxwell, Dirac). So in this chapter 
we shall seek to explain the essence of the perturbative approach to quantum 
field dynamics — which we take to be essentially the Feynman graph version 
of Yukawa’s exchange mechanism — in the context of simple models involving 
only scalar fields; Maxwell (vector) and Dirac (spinor) fields will be introduced 
in the following chapter. The route we follow to the ‘Feynman rules’ is the one 
first given (with remarkable clarity) by Dyson (1949a), which rapidly became 
the standard formulation. 

Before proceeding it may be worth emphasizing that in introducing a
‘non-harmonic’ term such as $\hat{\phi}^3$ and thus departing from linearity in that sense,
we are in no way affecting the basic linearity of state vector superposition in
quantum mechanics (cf (6.11)), which continues to hold.


6.2 Perturbation theory for interacting fields: the Dyson 
expansion of the S-matrix 


On the third day of the journey a remarkable thing happened; going into 
a sort of semi-stupor as one does after 48 hours of bus-riding, I began to 
think very hard about physics, and particularly about the rival radiation 
theories of Schwinger and Feynman. Gradually my thoughts grew more 
coherent, and before I knew where I was, I had solved the problem that 
had been in the back of my mind all this year, which was to prove the 
equivalence of the two theories. 


—From a letter from F. J. Dyson to his parents, 18 September 1948, as 
quoted in Schweber (1994), p 505. 


For definiteness, let us consider the Lagrangian

$$ \hat{\mathcal{L}} = \tfrac{1}{2}\partial_\mu\hat{\phi}\,\partial^\mu\hat{\phi} - \tfrac{1}{2}m^2\hat{\phi}^2 - \lambda\hat{\phi}^3 \equiv \hat{\mathcal{L}}_{\rm KG} - \lambda\hat{\phi}^3 \tag{6.12} $$

with $\lambda > 0$. Equation (6.12) is like ‘$\hat{\mathcal{L}} = \hat{\mathcal{T}} - \hat{\mathcal{V}}$’ where
$\hat{\mathcal{V}} = \tfrac{1}{2}(\nabla\hat{\phi})^2 + \tfrac{1}{2}m^2\hat{\phi}^2 + \lambda\hat{\phi}^3$ is the ‘potential’. Though simple, this Lagrangian is unfortunately not
physically sensible. The classical particle analogue potential would have the
form $V(q) = \tfrac{1}{2}\omega^2q^2 + \lambda q^3$. If we sketch $V(q)$ as a function of $q$ we see that,
for small $\lambda$, it retains the shape of an oscillator well near $q = 0$, but for $q$
sufficiently large and negative it will ‘turn over’, tending ultimately to $-\infty$ as
$q \to -\infty$. Classically we expect to be able to set up a successful perturbation
theory for oscillations about the equilibrium position $q = 0$, provided that
the amplitude of the oscillations is not so large as to carry the particle over
the ‘lip’ of the potential; in the latter case, the particle will escape to
$q = -\infty$, invalidating a perturbative approach. In the quantum mechanical case
the same potential $V(q)$ is more problematical, since the particle can tunnel
through the barrier separating it from the region where $V \to -\infty$. This
means that the ground state will not be stable. An analogous disease affects
the quantum field case: the supposed vacuum state will be unstable, and
indeed the energy will not be positive-definite.
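To make the ‘lip’ concrete, here is a small sketch for the classical analogue potential $V(q) = \tfrac{1}{2}\omega^2q^2 + \lambda q^3$ (unit mass is assumed, and the function names are ours). Setting $V'(q) = \omega^2 q + 3\lambda q^2 = 0$ gives the local maximum at $q = -\omega^2/3\lambda$, where the barrier height is $\omega^6/54\lambda^2$; a particle oscillating about $q = 0$ stays bound classically only if its energy is below that height.

```python
def cubic_potential(q, omega, lam):
    """V(q) = (1/2) omega^2 q^2 + lam q^3, with lam > 0."""
    return 0.5 * omega**2 * q**2 + lam * q**3

def lip_position(omega, lam):
    """Non-trivial root of V'(q) = omega^2 q + 3 lam q^2: the top of the barrier."""
    return -omega**2 / (3.0 * lam)

def barrier_height(omega, lam):
    """Value of V at the lip; analytically omega^6 / (54 lam^2)."""
    return cubic_potential(lip_position(omega, lam), omega, lam)

def stays_in_well(energy, omega, lam):
    """Classical criterion for a particle that starts inside the well near q = 0."""
    return energy < barrier_height(omega, lam)

print(lip_position(1.0, 0.1), barrier_height(1.0, 0.1))
print(stays_in_well(1.0, 1.0, 0.1), stays_in_well(2.0, 1.0, 0.1))
```

As $\lambda \to 0$ the lip recedes to $q \to -\infty$ and the barrier height grows like $1/\lambda^2$, which is why a perturbative treatment of small oscillations is reasonable classically. Quantum mechanically the barrier can still be penetrated by tunnelling, however high it is.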

Nevertheless, as the reader may already have surmised, and we shall confirm
later in this chapter, the ‘$\phi$-cubed’ interaction is precisely of the form
relevant to Yukawa’s exchange mechanism. As we have seen in the previous
section, such an interaction will typically give rise to matrix elements
between one-quantum and two-quantum states, for example, exactly like the
basic Yukawa emission and absorption process. In fact, all that is necessary
to make the $\hat{\phi}^3$-type interaction physical is to let it describe, not the
‘self-coupling’ of a single field, but the ‘interactive coupling’ of at least two
different fields. For example, we may have two scalar fields with quanta ‘A’
and ‘B’, and an interaction between them of the form $\lambda\hat{\phi}_A^2\hat{\phi}_B$. This will allow
processes such as $A \leftrightarrow A + B$. Or we may have three such fields, and an
interaction $\lambda\hat{\phi}_A\hat{\phi}_B\hat{\phi}_C$, allowing $A \leftrightarrow B + C$ and similar transitions. In these cases
the problems with the $\hat{\phi}^3$ self-interaction do not arise. (Incidentally those
problems can be eliminated by the addition of a suitable higher-power term,
for instance $g\hat{\phi}^4$.) In later sections we shall be considering the ‘ABC’ model
specifically, but for the present it will be simpler to continue with the single
field $\hat{\phi}$ and the self-interaction $\lambda\hat{\phi}^3$, as described by the Lagrangian (6.12).
The associated Hamiltonian is


$$ \hat{H} = \hat{H}_{\rm KG} + \hat{H}' \tag{6.13} $$


where (as is usual in perturbation theory) we have separated the Hamiltonian 
into a part we can handle exactly, which is the free Klein—Gordon Hamiltonian 


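$$ \hat{H}_{\rm KG} = \int d^3x\, \hat{\mathcal{H}}_{\rm KG} = \frac{1}{2}\int d^3x\, [\hat{\pi}^2 + (\nabla\hat{\phi})^2 + m^2\hat{\phi}^2] \tag{6.14} $$

and the part we shall treat perturbatively

$$ \hat{H}' = \int d^3x\, \hat{\mathcal{H}}' = \lambda\int d^3x\, \hat{\phi}^3. \tag{6.15} $$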


6.2.1 The interaction picture 


We begin with a crucial formal step. In our introduction to quantum field 
theory in the previous chapter, we worked in the Heisenberg picture (HP). 
There, however, we only dealt with free (non-interacting) fields. The time 
dependence of the operators as given by the mode expansion (5.155) is that 
generated by the free KG Hamiltonian (6.14) via the Heisenberg equations 
of motion (see problem 5.8). But as soon as we include the interaction term 
Ĥ', we cannot make progress in the HP, since we do not then know the time 
dependence of the operators — which is generated by the full Hamiltonian 
$\hat{H} = \hat{H}_{\rm KG} + \hat{H}'$.

Instead, we might consider using the Schrödinger picture (SP) in which 
the states change with time according to 


$$ \hat{H}|\psi(t)\rangle = i\frac{d}{dt}|\psi(t)\rangle \tag{6.16} $$


and the operators are time-independent (see appendix I). Note that although 
(6.16) is a ‘Schrödinger picture’ equation, there is nothing non-relativistic 
about it: on the contrary, $\hat{H}$ is the relevant relativistic Hamiltonian. In this
approach, the field operators appearing in the density $\hat{\mathcal{H}}$ are all evaluated at a
fixed time, say t = 0 by convention, which is the time at which the Schrödinger 
and Heisenberg pictures coincide. At this fixed time, mode expansions of the 
form (5.155) with t = 0 are certainly possible, since the basis functions form 
a complete set. 

One problem with this formulation, however, is that it is not going to be 
manifestly ‘Lorentz invariant’ (or covariant), because a particular time (t = 0) 
has been singled out. In the end, physical quantities should come out correct,
but it is much more convenient to have everything looking nice and consistent 
with relativity as we go along. This is one of the reasons for choosing to 
work in yet a third ‘picture’, an ingenious kind of half-way-house between 
the other two, called the ‘interaction picture’ (IP). We shall see other good 
reasons shortly. 

In the HP, all the time dependence is carried by the operators and none by 
the state, while in the SP it is exactly the other way around. In the IP, both 
states and operators are time-dependent but in a way that is well adapted 
to perturbation theory, especially in quantum field theory. The operators 
have a time dependence generated by the free Hamiltonian $\hat{H}_0$, say, and so a
‘free-particle’ mode expansion like (5.155) survives intact (here $\hat{H}_0 = \hat{H}_{\rm KG}$).
The states have a time dependence generated by the interaction $\hat{H}'$. Thus as
$\hat{H}' \to 0$ we return to the free-particle HP.

The way this works formally is as follows. In terms of the time-independent
SP operator $\hat{A}$ (cf appendix I), we define the corresponding IP operator $\hat{A}_I(t)$
by

$$ \hat{A}_I(t) = e^{i\hat{H}_0 t}\hat{A}e^{-i\hat{H}_0 t}. \tag{6.17} $$

This is just like the definition of the HP operator $\hat{A}(t)$ in appendix I, except
that $\hat{H}_0$ appears instead of the full $\hat{H}$. It follows that the time dependence of
$\hat{A}_I(t)$ is given by (I.8) with $\hat{H} \to \hat{H}_0$:

$$ \frac{d\hat{A}_I(t)}{dt} = -i[\hat{A}_I(t), \hat{H}_0]. \tag{6.18} $$

Equation (6.18) can also, of course, be derived by carefully differentiating
(6.17). Thus, as mentioned already, the time dependence of $\hat{A}_I(t)$ is generated
by the free part of the Hamiltonian, by construction.
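A quick numerical check of (6.17) and (6.18) may be reassuring. The sketch below assumes a finite-dimensional toy system in which $\hat{H}_0$ is diagonal, so that the exponentials in (6.17) reduce to phases; the three energy levels and the operator used in the demonstration are arbitrary illustrative choices, and the time derivative is approximated by a central difference.

```python
import numpy as np

def interaction_picture(op, free_energies, t):
    """A_I(t) = exp(i H0 t) A exp(-i H0 t) for H0 = diag(free_energies)."""
    phase = np.exp(1j * free_energies * t)
    # element (m, n) of A acquires the phase exp(i (E_m - E_n) t)
    return phase[:, None] * op * phase.conj()[None, :]

def commutator(x, y):
    return x @ y - y @ x

def eom_residual(op, free_energies, t, dt=1e-5):
    """Largest element of dA_I/dt + i [A_I, H0], which (6.18) says should vanish."""
    h0 = np.diag(free_energies)
    lhs = (interaction_picture(op, free_energies, t + dt)
           - interaction_picture(op, free_energies, t - dt)) / (2 * dt)
    rhs = -1j * commutator(interaction_picture(op, free_energies, t), h0)
    return np.max(np.abs(lhs - rhs))

# Illustrative choice: three oscillator levels and the operator a + a_dagger.
energies = np.array([0.5, 1.5, 2.5])
op = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, np.sqrt(2.0)],
               [0.0, np.sqrt(2.0), 0.0]], dtype=complex)
print(eom_residual(op, energies, t=0.7))   # small, limited by the finite-difference step
```

The residual is set only by the finite-difference error, which illustrates that the IP time dependence is fixed by $\hat{H}_0$ alone, whatever the interaction may be.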

As applied to our model theory (6.12), then, our field will now be specified
as being in the IP, $\hat{\phi}_I(x, t)$. What about the field canonically conjugate
to $\hat{\phi}_I(x, t)$, in the case when the interaction is included? In the HP, as long as
the interaction does not contain time derivatives, as is the case here, the field
canonically conjugate to the interacting field remains the same as the free-field
case:

$$ \hat{\pi}(x, t) = \frac{\partial\hat{\mathcal{L}}}{\partial\dot{\hat{\phi}}(x, t)} = \frac{\partial\hat{\mathcal{L}}_{\rm KG}}{\partial\dot{\hat{\phi}}(x, t)} = \dot{\hat{\phi}}(x, t) \tag{6.19} $$

so that we continue to adopt the equal-time commutation relation

$$ [\hat{\phi}(\mathbf{x}, t), \hat{\pi}(\mathbf{y}, t)] = i\delta^3(\mathbf{x} - \mathbf{y}) \tag{6.20} $$

for the Heisenberg fields. But the IP fields are related to the HP fields by a
unitary transformation $\hat{U}$, as we can see by combining (6.17) with (I.7):

$$ \hat{A}_I(t) = e^{i\hat{H}_0 t}e^{-i\hat{H}t}\hat{A}(t)e^{i\hat{H}t}e^{-i\hat{H}_0 t} = \hat{U}\hat{A}(t)\hat{U}^{-1} \tag{6.21} $$

where $\hat{U} = e^{i\hat{H}_0 t}e^{-i\hat{H}t}$ and it is easy to check that $\hat{U}\hat{U}^\dagger = \hat{U}^\dagger\hat{U} = \hat{I}$. So taking
equation (6.20) and pre-multiplying by $\hat{U}$ and post-multiplying by $\hat{U}^{-1}$ on
both sides, we obtain

$$ [\hat{\phi}_I(\mathbf{x}, t), \hat{\pi}_I(\mathbf{y}, t)] = i\delta^3(\mathbf{x} - \mathbf{y}) \tag{6.22} $$

showing that, in the interacting case, the IP fields $\hat{\phi}_I$ and $\hat{\pi}_I$ obey the free
field commutation relation. Thus in the IP case the interacting fields obey the 
same equations of motion and the same commutation relations as the free-field 
operators. It follows that the mode expansion (5.155), and the commutation 
relations (5.158) for the mode creation and annihilation operators, can be 
taken straight over for the IP operators. 

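We now turn to the states in the IP. To preserve consistency between the
matrix elements in the Schrödinger and interaction pictures (cf the step from
(I.6) to (I.7)) we define the corresponding IP state vector by

$$ |\psi(t)\rangle_I = e^{i\hat{H}_0 t}|\psi(t)\rangle \tag{6.23} $$

in terms of the SP state $|\psi(t)\rangle$. We now use (6.23) to find the equation of
motion of $|\psi(t)\rangle_I$. We have

$$ \begin{aligned}
i\frac{d}{dt}|\psi(t)\rangle_I &= -\hat{H}_0 e^{i\hat{H}_0 t}|\psi(t)\rangle + e^{i\hat{H}_0 t}\, i\frac{d}{dt}|\psi(t)\rangle \\
&= e^{i\hat{H}_0 t}\left[-\hat{H}_0|\psi(t)\rangle + (\hat{H}_0 + \hat{H}')|\psi(t)\rangle\right] \\
&= e^{i\hat{H}_0 t}\hat{H}'|\psi(t)\rangle \\
&= e^{i\hat{H}_0 t}\hat{H}'e^{-i\hat{H}_0 t}|\psi(t)\rangle_I,
\end{aligned} \tag{6.24} $$

or

$$ i\frac{d}{dt}|\psi(t)\rangle_I = \hat{H}'_I(t)|\psi(t)\rangle_I \tag{6.25} $$

where

$$ \hat{H}'_I = e^{i\hat{H}_0 t}\hat{H}'e^{-i\hat{H}_0 t} \tag{6.26} $$

is the interaction Hamiltonian in the interaction picture. The italicised words
are important: they mean that all operators in $\hat{H}'_I$ have the (known) free-field
time dependence, which would not be the case for $\hat{H}'$ in the HP. Thus, as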
mentioned earlier, the states in the IP have a time dependence generated by 
the interaction Hamiltonian, and this derivation has shown us that it is, in 
fact, the interaction Hamiltonian in the IP which is the appropriate generator 
of time change in this picture. 

Equation (6.25) is a slightly simplified form of the Tomonaga—Schwinger 
equation, which formed the starting point of the approach to QED followed by 
Schwinger (Schwinger 1948b, 1949a, b) and independently by Tomonaga and 
his group (Tomonaga 1946, Koba, Tati and Tomonaga 1947a, b, Kanesawa 
and Tomonaga 1948a, b, Koba and Tomonaga 1948, Koba and Takeda 1948, 
1949). 
6.2.2 The S-matrix and the Dyson expansion


We now start the job of applying the IP formalism to scattering and decay 
processes in quantum field theory, treated in perturbation theory; for this, 
following Dyson (1949a, b), the crucial quantity is the scattering matrix, or 
S-matrix for short, which we now introduce. A scattering process may plausibly
be described in the following terms. At a time $t \to -\infty$, long before any
interaction has occurred, we expect the effect of $\hat{H}'_I$ to be negligible so that,
from (6.25), $|\psi(-\infty)\rangle_I$ will be a constant state vector $|i\rangle$, which is in fact an
eigenstate of $\hat{H}_0$. Thus $|i\rangle$ will contain a certain number of non-interacting
particles with definite momenta, and $|\psi(-\infty)\rangle_I = |i\rangle$. As time evolves, the
particles approach each other and may scatter, leading in the distant future
(at $t \to \infty$) to another constant state $|\psi(\infty)\rangle_I$ containing non-interacting
particles. Note that $|\psi(\infty)\rangle_I$ will in general contain many different components,
each with (in principle) different numbers and types of particle; these different
components in $|\psi(\infty)\rangle_I$ will be denoted by $|f\rangle$. The $\hat{S}$-operator is now defined
via

$$ |\psi(\infty)\rangle_I = \hat{S}|\psi(-\infty)\rangle_I = \hat{S}|i\rangle. \tag{6.27} $$

A particular S-matrix element is then the amplitude for finding a particular
final state $|f\rangle$ in $|\psi(\infty)\rangle_I$:

$$ \langle f|\psi(\infty)\rangle_I = \langle f|\hat{S}|i\rangle \equiv S_{fi}. \tag{6.28} $$

Thus we may write

$$ |\psi(\infty)\rangle_I = \sum_f |f\rangle\langle f|\psi(\infty)\rangle_I = \sum_f S_{fi}|f\rangle. \tag{6.29} $$

It is clear that it is these S-matrix elements $S_{fi}$ that we need to calculate, and
the associated probabilities $|S_{fi}|^2$.

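Before proceeding we note an important property of $\hat{S}$. Assuming that
$|\psi(\infty)\rangle_I$ and $|i\rangle$ are both normalized, we have

$$ 1 = {}_I\langle\psi(\infty)|\psi(\infty)\rangle_I = \langle i|\hat{S}^\dagger\hat{S}|i\rangle = \langle i|i\rangle \tag{6.30} $$

implying that $\hat{S}$ is unitary: $\hat{S}^\dagger\hat{S} = \hat{I}$. Taking matrix elements of this gives us
the result

$$ \sum_k S^*_{kf}S_{ki} = \delta_{fi}. \tag{6.31} $$

Putting $i = f$ in (6.31) yields $\sum_k |S_{ki}|^2 = 1$, which confirms that the expansion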
coefficients in (6.29) must obey the usual condition that the sum of all the 
partial probabilities must add up to 1. Note, however, that in the present case 
the states involved may contain different numbers of particles. 

We set up a perturbation-theory approach to calculating S as follows. 
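Integrating (6.25) subject to the condition at $t \to -\infty$ yields

$$ |\psi(t)\rangle_I = |i\rangle - i\int_{-\infty}^{t}\hat{H}'_I(t')|\psi(t')\rangle_I\,dt'. \tag{6.32} $$

This is an integral equation in which the unknown $|\psi(t)\rangle_I$ is buried under
the integral on the right-hand side, rather similar to the one we encounter in
non-relativistic scattering theory (equation (H.12) of appendix H). As in that
case, we solve it iteratively. If $\hat{H}'_I$ is neglected altogether, then the solution is

$$ |\psi(t)\rangle_I^{(0)} = |i\rangle. \tag{6.33} $$

To get the first order in $\hat{H}'_I$ correction to this, insert (6.33) in place of $|\psi(t')\rangle_I$
on the right-hand side of (6.32) to obtain

$$ |\psi(t)\rangle_I^{(1)} = |i\rangle + \int_{-\infty}^{t}(-i\hat{H}'_I(t_1))\,dt_1\,|i\rangle \tag{6.34} $$

recalling that $|i\rangle$ is a constant state vector. Putting this back into (6.32) yields
$|\psi(t)\rangle_I$ correct to second order in $\hat{H}'_I$:

$$ |\psi(t)\rangle_I^{(2)} = |i\rangle + \int_{-\infty}^{t}(-i\hat{H}'_I(t_1))\,dt_1\,|i\rangle + \int_{-\infty}^{t}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat{H}'_I(t_1))(-i\hat{H}'_I(t_2))\,|i\rangle \tag{6.35} $$

which is as far as we intend to go. Letting $t \to \infty$ then gives us our perturbative
series for the $\hat{S}$-operator:

$$ \hat{S} = 1 + \int_{-\infty}^{\infty}(-i\hat{H}'_I(t_1))\,dt_1 + \int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat{H}'_I(t_1))(-i\hat{H}'_I(t_2)) + \cdots \tag{6.36} $$

with the dots indicating the higher-order terms, which are in fact summarized
by the full formula

$$ \hat{S} = \sum_{n=0}^{\infty}(-i)^n\int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\,\hat{H}'_I(t_1)\hat{H}'_I(t_2)\cdots\hat{H}'_I(t_n). \tag{6.37} $$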


We could immediately start getting to work with (6.37), but there is one 
more useful technical adjustment to make. Remembering that 


$$ \hat{H}'_I(t) = \int\hat{\mathcal{H}}'_I(\mathbf{x}, t)\,d^3x \tag{6.38} $$

we can write the second term of (6.36) as

$$ \iint_{t_1 > t_2} d^4x_1\,d^4x_2\,(-i\hat{\mathcal{H}}'_I(x_1))(-i\hat{\mathcal{H}}'_I(x_2)) \tag{6.39} $$

which looks much more symmetrical in $\mathbf{x}$ and $t$. However, there is still an awkward
asymmetry between the $\mathbf{x}$-integrals and the $t$-integrals because of the
$t_1 > t_2$ condition. The $t$-integrals can be converted to run from $-\infty$ to $\infty$
without constraint, like the $\mathbf{x}$ ones, by a clever trick. Note that the ordering of
the operators $\hat{\mathcal{H}}'_I$ is significant (since they will contain non-commuting bits),
and that it is actually given by the order of their time arguments, ‘earlier’
operators appearing to the right of ‘later’ ones. This feature must be preserved,
obviously, when we let the $t$-integrals run over the full infinite domain.
We can arrange for this by introducing the time-ordering symbol $T$, which is
defined by

$$ T(\hat{\mathcal{H}}'_I(x_1)\hat{\mathcal{H}}'_I(x_2)) = \begin{cases} \hat{\mathcal{H}}'_I(x_1)\hat{\mathcal{H}}'_I(x_2) & \text{for } t_1 > t_2 \\ \hat{\mathcal{H}}'_I(x_2)\hat{\mathcal{H}}'_I(x_1) & \text{for } t_1 < t_2 \end{cases} \tag{6.40} $$

and similarly for more products, and for arbitrary operators. Then (see problem
6.1) (6.39) can be written as

$$ \frac{1}{2}\iint d^4x_1\,d^4x_2\,T[(-i\hat{\mathcal{H}}'_I(x_1))(-i\hat{\mathcal{H}}'_I(x_2))] \tag{6.41} $$

where the integrals are now unrestricted. Applying a similar analysis to the 
general term gives us the Dyson expansion of the S operator: 


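$$ \hat{S} = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int\!\cdots\!\int d^4x_1\,d^4x_2\cdots d^4x_n\,T\{\hat{\mathcal{H}}'_I(x_1)\hat{\mathcal{H}}'_I(x_2)\cdots\hat{\mathcal{H}}'_I(x_n)\}. \tag{6.42} $$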

This fundamental formula provides the bridge leading from the
Tomonaga-Schwinger equation (6.25) to the Feynman amplitudes (Feynman 1949a, b),
as we shall see in detail in section 6.3.2 for the ‘ABC’ case.
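The structure of the expansion can be tested on a toy model where everything is computable. The sketch below is purely illustrative and rests on several assumptions that are not part of the text: a two-level system with diagonal $\hat{H}_0$, a weak off-diagonal coupling playing the role of $\hat{H}'$, and a finite time interval $[0, T]$ in place of $(-\infty, \infty)$. It compares the series (6.36), truncated at first and at second order, with the exact IP evolution operator $e^{i\hat{H}_0 T}e^{-i\hat{H}T}$, and it also checks that the nested second-order term agrees with the time-ordered form with its factor $\tfrac{1}{2}$, as in the step from (6.39) to (6.41).

```python
import numpy as np

ENERGIES = np.array([0.0, 1.0])            # assumed eigenvalues of H0
H_PRIME = 0.05 * np.array([[0.0, 1.0],     # assumed weak coupling between the levels
                           [1.0, 0.0]])

def free_phase(t):
    """exp(i H0 t) for the diagonal H0."""
    return np.diag(np.exp(1j * ENERGIES * t))

def h_prime_ip(t):
    """Interaction Hamiltonian in the interaction picture, as in (6.26)."""
    u = free_phase(t)
    return u @ H_PRIME @ u.conj().T

def exact_evolution(total_time):
    """exp(i H0 T) exp(-i H T): evolves the IP state from t = 0 to t = T."""
    w, v = np.linalg.eigh(np.diag(ENERGIES) + H_PRIME)
    full = v @ np.diag(np.exp(-1j * w * total_time)) @ v.conj().T
    return free_phase(total_time) @ full

def dyson_terms(total_time, steps=200):
    """First- and second-order terms of (6.36), with nested time integrals on [0, T]."""
    dt = total_time / steps
    samples = [h_prime_ip((k + 0.5) * dt) for k in range(steps)]
    first = np.zeros((2, 2), dtype=complex)
    second = np.zeros((2, 2), dtype=complex)
    earlier = np.zeros((2, 2), dtype=complex)   # running sum of H'_I at earlier times
    for h in samples:
        first += -1j * h * dt
        # later operator stands to the left; the equal-time cell is counted with weight 1/2
        second += -(h @ earlier + 0.5 * h @ h) * dt**2
        earlier += h
    return first, second

def second_order_time_ordered(total_time, steps=200):
    """Second-order term in the form (6.41): unrestricted integrals, T-product, factor 1/2."""
    dt = total_time / steps
    samples = [h_prime_ip((k + 0.5) * dt) for k in range(steps)]
    total = np.zeros((2, 2), dtype=complex)
    for k in range(steps):
        for l in range(steps):
            later, earlier = (k, l) if k >= l else (l, k)
            total += samples[later] @ samples[earlier]
    return -0.5 * total * dt**2

def truncation_errors(total_time=2.0):
    """Norm of (exact - truncated series) at first and at second order."""
    exact = exact_evolution(total_time)
    first, second = dyson_terms(total_time)
    identity = np.eye(2)
    return (np.linalg.norm(exact - identity - first),
            np.linalg.norm(exact - identity - first - second))

print(truncation_errors())
```

With a coupling this weak, including the second-order term should reduce the discrepancy with the exact result, the remainder being of third order in the coupling. In the field theory case no exact result is available, of course, and the series itself is all we have.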


6.3 Applications to the ‘ABC’ theory 


As previously explained, the simple self-interacting $\hat{\phi}^3$ theory is not respectable.
Following Griffiths (2008) we shall instead apply the foregoing covariant
perturbation theory to a hypothetical world consisting of three distinct types of
scalar particles A, B and C, with masses $m_A$, $m_B$, $m_C$. Each is described by
a real scalar field which, if free, would obey the appropriate KG equation; the
interaction term is $g\hat{\phi}_A\hat{\phi}_B\hat{\phi}_C$. We shall from now on omit the IP subscript ‘I’,
since all operators are taken to be in the IP. Thus the Hamiltonian is

$$ \hat{H} = \hat{H}_0 + \hat{H}' \tag{6.43} $$

where

$$ \hat{H}_0 = \frac{1}{2}\sum_{i=A,B,C}\int [\hat{\pi}_i^2 + (\nabla\hat{\phi}_i)^2 + m_i^2\hat{\phi}_i^2]\,d^3x \tag{6.44} $$

and

$$ \hat{H}' = g\int d^3x\,\hat{\phi}_A\hat{\phi}_B\hat{\phi}_C \equiv \int d^3x\,\hat{\mathcal{H}}'. \tag{6.45} $$

Each field $\hat{\phi}_i$ ($i = A, B, C$) has a mode expansion of the form (5.143), and
associated creation and annihilation operators $\hat{a}_i^\dagger$ and $\hat{a}_i$ which obey the
commutation relations

$$ [\hat{a}_i(k), \hat{a}_j^\dagger(k')] = (2\pi)^3\delta^3(\mathbf{k} - \mathbf{k}')\delta_{ij} \qquad i, j = A, B, C. \tag{6.46} $$

The new feature in (6.46) is that operators associated with distinct particles
commute. In a similar way, we also have $[\hat{a}_i, \hat{a}_j] = [\hat{a}_i^\dagger, \hat{a}_j^\dagger] = 0$.


6.3.1 The decay C → A + B


As our first application of (6.42), we shall calculate the decay rate (or
resonance width) for the decay $C \to A + B$, to lowest order in $g$. Admittedly this is
not yet a realistic, physical, example; even so, the basic steps in the calculation
are common to more complicated physical examples, such as $W^- \to e^- + \bar{\nu}_e$.

We suppose that the initial state $|i\rangle$ consists of one C particle with
4-momentum $p_C$, and that the final state in which we are interested is that with
one A and one B particle present, with 4-momenta $p_A$ and $p_B$ respectively.
We want to calculate the matrix element

$$ S_{fi} = \langle p_A, p_B|\hat{S}|p_C\rangle \tag{6.47} $$

to lowest order in $g$. (Note that the ‘1’ term in (6.36) cannot contribute here
because the initial and final states are plainly orthogonal.) This means that
we need to evaluate the amplitude

$$ \mathcal{A}_{fi}^{(1)} = -ig\langle p_A, p_B|\int d^4x\,\hat{\phi}_A(x)\hat{\phi}_B(x)\hat{\phi}_C(x)|p_C\rangle. \tag{6.48} $$


To proceed we need to decide on the normalization of our states $|p_i\rangle$. We will
define (for $i = A, B, C$)

$$ |p_i\rangle = \sqrt{2E_i}\,\hat{a}_i^\dagger(p_i)|0\rangle \tag{6.49} $$

where $E_i = \sqrt{m_i^2 + \mathbf{p}_i^2}$, so that (using (6.46))

$$ \langle p_i'|p_i\rangle = 2E_i(2\pi)^3\delta^3(\mathbf{p}_i' - \mathbf{p}_i). \tag{6.50} $$

The quantity $E_i\delta^3(\mathbf{p}_i' - \mathbf{p}_i)$ is Lorentz invariant. Note that the completeness
relation for such states reads

$$ \int\frac{d^3p_i}{(2\pi)^3\,2E_i}|p_i\rangle\langle p_i| = 1 \tag{6.51} $$

where the ‘1’ on the right-hand side means the identity in the subspace of
such one-particle states, and zero for all other states. The normalization
choice (6.49) corresponds (see comment (5) in section 5.2.5) to a wavefunction
normalization of $2E_i$ particles per unit volume.

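Consider now just the $\hat{\phi}_C(x)|p_C\rangle$ piece of (6.48). This is

$$ \int\frac{d^3k}{(2\pi)^3\sqrt{2E_k}}\left[\hat{a}_C(k)e^{-ik\cdot x} + \hat{a}_C^\dagger(k)e^{ik\cdot x}\right]\sqrt{2E_C}\,\hat{a}_C^\dagger(p_C)|0\rangle \tag{6.52} $$

where $k = (E_k, \mathbf{k})$ and $E_k = \sqrt{\mathbf{k}^2 + m_C^2}$. The term with two $\hat{a}_C^\dagger$’s will give
zero when bracketed with a final state containing no C particles. In the other
term, we use (6.46) together with $\hat{a}_C(k)|0\rangle = 0$ to reduce (6.52) to

$$ \int\frac{d^3k}{(2\pi)^3\sqrt{2E_k}}(2\pi)^3\delta^3(\mathbf{p}_C - \mathbf{k})\sqrt{2E_C}\,e^{-ik\cdot x}|0\rangle = e^{-ip_C\cdot x}|0\rangle \tag{6.53} $$

where $p_C = (\sqrt{\mathbf{p}_C^2 + m_C^2}, \mathbf{p}_C)$. In exactly the same way we find that, when
bracketed with an initial state containing no A’s or B’s,

$$ \langle p_A, p_B|\hat{\phi}_A(x)\hat{\phi}_B(x) = \langle 0|e^{ip_A\cdot x}e^{ip_B\cdot x}. \tag{6.54} $$

Hence the amplitude (6.48) becomes just

$$ \mathcal{A}_{fi}^{(1)} = -ig\int d^4x\,e^{i(p_A + p_B - p_C)\cdot x} = -ig(2\pi)^4\delta^4(p_A + p_B - p_C). \tag{6.55} $$

Unsurprisingly, but reassuringly, we have discovered that the amplitude vanishes
unless the 4-momentum is conserved via the $\delta$-function condition:
$p_C = p_A + p_B$.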

It is clear that such a transition will not occur unless $m_C > m_A + m_B$
(in the rest frame of the C, we need $m_C = \sqrt{m_A^2 + \mathbf{p}^2} + \sqrt{m_B^2 + \mathbf{p}^2}$), so let
us assume this to be the case. We would now like to calculate the rate for
the decay $C \to A + B$. To do this, we shall adopt a plausible generalization
of the ordinary procedure followed in quantum mechanical time-dependent
perturbation theory (the reader may wish to consult section H.3 of appendix H
at this point, to see a non-relativistic analogue). The first problem is that
the transition probability $|\mathcal{A}_{fi}^{(1)}|^2$ apparently involves the square of the
four-dimensional $\delta$-function. This is bad news, since (to take a simple case, and
using (E.53)) $\delta(x - a)\delta(x - a) = \delta(x - a)\delta(0)$ and $\delta(0)$ is infinite. In our
case we have a four-fold infinity. This trouble has arisen because we have
been using plane-wave solutions of our wave equation, and these notoriously
lead to such problems. A proper procedure would set the whole thing up
using wave packets, as is done, for instance, in Peskin and Schroeder (1995),
section 4.5. An easier remedy is to adopt ‘box normalization’, in which we
imagine that space has the finite volume $V$, and the interaction is turned on
only for a time $T$. Then ‘$(2\pi)^4\delta^4(0)$’ is effectively ‘$VT$’ (see Weinberg (1995,
section 3.4)). Dividing this factor out, the transition rate per unit volume is
then

$$ \dot{P}_{fi} = |\mathcal{A}_{fi}^{(1)}|^2/VT = (2\pi)^4\delta^4(p_A + p_B - p_C)|\mathcal{M}_{fi}|^2 \tag{6.56} $$

where (cf (6.55))

$$ \mathcal{A}_{fi}^{(1)} = (2\pi)^4\delta^4(p_A + p_B - p_C)\,i\mathcal{M}_{fi} \tag{6.57} $$

so that the invariant amplitude $i\mathcal{M}_{fi}$ is just $-ig$, in this case.

Equation (6.56) is the probability per unit time for a transition to one
specific final state $|f\rangle$. But in the present case (and in all similar ones with at
least two particles in the final state), the A + B final states form a continuum,
and to get the total rate $\Gamma$ we need to integrate $\dot{P}_{fi}$ over all the continuum
of final states, consistent with energy-momentum conservation. The
corresponding differential decay rate $d\Gamma$ is defined by $d\Gamma = \dot{P}_{fi}\,dN_f$ where $dN_f$ is
the number of final states, per particle, lying in a momentum space volume
$d^3p_A\,d^3p_B$ about $\mathbf{p}_A$ and $\mathbf{p}_B$. For the normalization (6.49), this number is

$$ dN_f = \frac{d^3p_A}{(2\pi)^3\,2E_A}\,\frac{d^3p_B}{(2\pi)^3\,2E_B}. \tag{6.58} $$

Finally, to get a normalization-independent quantity we must divide by the
number of decaying particles per unit volume, which is $2E_C$. Thus our final
formula for the decay rate is

$$ \Gamma = \int d\Gamma = \frac{1}{2E_C}(2\pi)^4\int\delta^4(p_A + p_B - p_C)|\mathcal{M}_{fi}|^2\,\frac{d^3p_A}{(2\pi)^3\,2E_A}\,\frac{d^3p_B}{(2\pi)^3\,2E_B}. \tag{6.59} $$

Note that the ‘$d^3p/2E$’ factors are Lorentz invariant (see the exercise in
appendix E) and so are all the other terms in (6.59) except $E_C$, which contributes
the correct Lorentz-transformation character for a rate (i.e. rate $\propto 1/\gamma$).

We now calculate the total rate $\Gamma$ in the rest frame of the decaying C
particle. In this case, the 3-momentum part of the $\delta^4$ gives $\mathbf{p}_A + \mathbf{p}_B = 0$, so
$\mathbf{p}_A = \mathbf{p} = -\mathbf{p}_B$, and the energy part becomes $\delta(E - m_C)$ where

$$ E = \sqrt{m_A^2 + \mathbf{p}^2} + \sqrt{m_B^2 + \mathbf{p}^2} = E_A + E_B. \tag{6.60} $$

So the total rate is

$$ \Gamma = \frac{1}{2m_C}\frac{g^2}{(2\pi)^2}\int\frac{d^3p}{4E_AE_B}\,\delta(E - m_C). \tag{6.61} $$

Differentiating (6.60) we find

$$ dE = \left(\frac{|\mathbf{p}|}{E_A} + \frac{|\mathbf{p}|}{E_B}\right)d|\mathbf{p}| = \frac{|\mathbf{p}|E}{E_AE_B}\,d|\mathbf{p}|. \tag{6.62} $$

Thus we may write

$$ d^3p = 4\pi|\mathbf{p}|^2\,d|\mathbf{p}| = 4\pi|\mathbf{p}|\frac{E_AE_B}{E}\,dE \tag{6.63} $$

and use the energy $\delta$-function in (6.61) to do the $dE$ integral yielding finally

$$ \Gamma = \frac{g^2}{8\pi}\frac{|\mathbf{p}|}{m_C^2}. \tag{6.64} $$

The quantity $|\mathbf{p}|$ is actually determined from (6.60) now with $E = m_C$; after
some algebra, we find (problem 5.2)

$$ |\mathbf{p}| = [m_A^4 + m_B^4 + m_C^4 - 2m_A^2m_B^2 - 2m_B^2m_C^2 - 2m_C^2m_A^2]^{1/2}/2m_C. \tag{6.65} $$
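Formulas (6.64) and (6.65) are easily turned into a few lines of code. In the sketch below the function names are ours, and the symmetric combination of squared masses under the square root in (6.65) is given the name it usually carries elsewhere in the literature (the Källén function), which is not used in the text.

```python
import math

def kallen(a, b, c):
    """Symmetric combination a^2 + b^2 + c^2 - 2ab - 2bc - 2ca."""
    return a * a + b * b + c * c - 2 * a * b - 2 * b * c - 2 * c * a

def decay_momentum(m_c, m_a, m_b):
    """|p| of either decay product in the C rest frame, equation (6.65)."""
    if m_c < m_a + m_b:
        raise ValueError("decay C -> A + B is forbidden unless m_C >= m_A + m_B")
    lam = kallen(m_a**2, m_b**2, m_c**2)
    return math.sqrt(max(lam, 0.0)) / (2.0 * m_c)   # guard against rounding at threshold

def decay_width(g, m_c, m_a, m_b):
    """Lowest-order width Gamma = g^2 |p| / (8 pi m_C^2), equation (6.64)."""
    return g**2 * decay_momentum(m_c, m_a, m_b) / (8.0 * math.pi * m_c**2)

# Check energy conservation for an arbitrary illustrative set of masses.
p = decay_momentum(5.0, 1.0, 2.0)
print(p, math.sqrt(1.0 + p**2) + math.sqrt(4.0 + p**2))   # second number should equal m_C
```

Two limits are worth noting: at threshold, $m_C = m_A + m_B$, the momentum and hence the width vanish, while for $m_A = m_B = 0$ one finds $|\mathbf{p}| = m_C/2$.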


Equation (6.64) is the result of an ‘almost real life’ calculation and a number
of comments are in order. First, consider the question of dimensions. In
our units $\hbar = c = 1$, $\Gamma$ as an inverse time should have the dimensions of a
mass (see appendix B), which can also be understood if we think of $\Gamma$ as the
width of an unstable resonance state. This requires ‘$g$’ to have the dimensions
of a mass, i.e. $g \sim M$ in these units. Going back to our Hamiltonian (6.44)
and (6.45), which must also have dimensions of a mass, we see from (6.44)
that the scalar fields $\hat{\phi}_i \sim M$ (using $d^3x \sim M^{-3}$), and hence from (6.45)
$g \sim M$ as required. It turns out that the dimensionality of the coupling
constants (such as $g$) is of great significance in quantum field theory. In QED,
the analogous quantity is the charge $e$, and this is dimensionless in our units
($\alpha = e^2/4\pi \approx 1/137$, see appendix C). However, we saw in (1.31) that Fermi’s
‘four-fermion’ coupling constant $G$ had dimensions $\sim M^{-2}$, while Yukawa’s
‘$g_N$’ and ‘$g'$’ (see figure 1.4) were both dimensionless. In fact, as we shall
explain in section 11.8, the dimensionality of a theory’s coupling constant is 
an important guide as to whether the infinities generally present in the theory 
can be controlled by renormalization (see chapter 10) or not: in particular, 
theories in which the coupling constant has negative mass dimensions, such as 
the ‘four-fermion’ theory, are not renormalizable. Theories with dimension- 
less coupling constants, such as QED, are generally renormalizable, though 
not invariably so. Theories whose coupling constants have positive mass di- 
mension, as in the ABC model, are ‘super-renormalizable’, meaning (roughly) 
that they have fewer basic divergences than ordinary renormalizable theories 
(see section 11.8). 
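The dimension counting behind these statements can be mechanized. The following sketch assumes four space-time dimensions with $\hbar = c = 1$, so that an interaction term in the Lagrangian density must have mass dimension 4. The scalar field dimension ($\sim M$) is the one found above; the values used for spin-$\frac{1}{2}$ and vector fields ($M^{3/2}$ and $M$) are standard but are not derived in this section, and the function names are ours.

```python
from fractions import Fraction

# Mass dimensions of fields in four dimensions (hbar = c = 1).
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),   # assumed here, not derived in this section
    "vector": Fraction(1),       # assumed here, not derived in this section
}

def coupling_dimension(fields, derivatives=0):
    """Mass dimension of the coupling multiplying a product of fields (and derivatives)."""
    return Fraction(4) - sum(FIELD_DIMENSION[f] for f in fields) - derivatives

def classify(dimension):
    """Rough guide quoted in the text; the dimensionless case is not invariably renormalizable."""
    if dimension > 0:
        return "super-renormalizable"
    if dimension == 0:
        return "generally renormalizable"
    return "non-renormalizable"

examples = {
    "ABC model": ["scalar", "scalar", "scalar"],
    "QED vertex": ["fermion", "fermion", "vector"],
    "four-fermion": ["fermion"] * 4,
}
for name, fields in examples.items():
    dim = coupling_dimension(fields)
    print(name, dim, classify(dim))
```

This reproduces the three cases mentioned above: $g \sim M$ for the ABC model, a dimensionless $e$ in QED, and $G \sim M^{-2}$ for the four-fermion coupling.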

In the present case, let us say that the mass of the decaying particle, $m_C$,
‘sets the scale’ for $g$, so that we write $g = \tilde{g}m_C$ and then

$$ \Gamma = \frac{\tilde{g}^2}{8\pi}|\mathbf{p}| \tag{6.66} $$

where $\tilde{g}$ is dimensionless. Equation (6.66) shows us nicely that $\Gamma$ is simply
proportional to the energy release in the decay, as determined by $|\mathbf{p}|$ (one often
says that $\Gamma$ is determined ‘by the available phase space’). If $m_C$ is exactly
equal to $m_A + m_B$, then $|\mathbf{p}|$ vanishes and so does $\Gamma$. At the opposite extreme,
if $m_A$ and $m_B$ are negligible compared to $m_C$, we would have

$$ \Gamma = \frac{\tilde{g}^2}{16\pi}m_C. \tag{6.67} $$

Equation (6.67) shows that, even if $\tilde{g}^2/16\pi$ is small ($\sim 1/137$ say) $\Gamma$ can still
be surprisingly large if $m_C$ is, as in $W^- \to e^- + \bar{\nu}_e$ for example.


6.3.2 A + B → A + B scattering: the amplitudes


We now consider the two-particle → two-particle process

$$ A + B \to A + B \tag{6.68} $$

in which the initial 4-momenta are $p_A$, $p_B$ and the final 4-momenta are $p_A'$,
$p_B'$ so that $p_A + p_B = p_A' + p_B'$. Our main task is to calculate the matrix
element $\langle p_A', p_B'|\hat{S}|p_A, p_B\rangle$ to lowest non-trivial order in $g$. The result will
be the derivation of our first ‘Feynman rules’ for amplitudes in perturbative 
quantum field theory. 

The first term in the $-operator expansion (6.42) is ‘1’, which does not 
involve g at all. Nevertheless, it is a useful exercise to evaluate and understand 
this contribution (which in the present case does not vanish), namely 


(O|@a (Py )an (Pp) 44 (pa Jâ} (pB)|0) (166a EB Eh Ep)”. (6.69) 


We shall have to evaluate many such vacuum expectation values (vev) of prod- 
ucts of a'’s and @’s. The general strategy is to commute the ât’s to the left, 
and the @’s to the right, and then make use of the facts 


(ola! = @;|0) = 0 (6.70) 


for any i = A,B,C. Thus, remembering that all ‘A’ operators commute with 
all ‘B’ ones, the vev in (6.69) is equal to 


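\[
\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}^\dagger_A(p_A)\{(2\pi)^3\delta^3(\boldsymbol{p}_B - \boldsymbol{p}'_B) + \hat{a}^\dagger_B(p_B)\hat{a}_B(p'_B)\}|0\rangle \\
&\quad = \langle 0|\{(2\pi)^3\delta^3(\boldsymbol{p}_A - \boldsymbol{p}'_A) + \hat{a}^\dagger_A(p_A)\hat{a}_A(p'_A)\}(2\pi)^3\delta^3(\boldsymbol{p}_B - \boldsymbol{p}'_B)|0\rangle \\
&\quad = (2\pi)^3\delta^3(\boldsymbol{p}_A - \boldsymbol{p}'_A)\,(2\pi)^3\delta^3(\boldsymbol{p}_B - \boldsymbol{p}'_B).
\end{aligned} \tag{6.71}
\]
The $\delta$-functions enforce $E_A = E'_A$ and $E_B = E'_B$ so that (6.69) becomes
\[ 2E_A(2\pi)^3\delta^3(\boldsymbol{p}_A - \boldsymbol{p}'_A)\,2E_B(2\pi)^3\delta^3(\boldsymbol{p}_B - \boldsymbol{p}'_B), \tag{6.72} \]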


a result which just expresses the normalization of the states, and the fact 
that, with no ‘g’ entering, the particles have not interacted at all, but have 
continued on their separate ways, quite unperturbed ($\boldsymbol{p}_A = \boldsymbol{p}'_A$, $\boldsymbol{p}_B = \boldsymbol{p}'_B$).
This contribution can be represented diagrammatically as figure 6.1. 

Next, consider the term of order $g$, which we used in C → A + B. This is

\[ -ig\int d^4x\,\langle p'_A, p'_B|\hat{\phi}_A(x)\hat{\phi}_B(x)\hat{\phi}_C(x)|p_A, p_B\rangle. \tag{6.73} \]


We have to remember, now, that all the $\hat{\phi}_i$ operators are in the interaction
picture and are therefore represented by standard mode expansions involving
the free creation and annihilation operators $\hat{a}^\dagger_i$ and $\hat{a}_i$, i.e. the same ones used
in defining the initial and final state vectors. It is then obvious that (6.73)
must vanish, since no C-particle exists in either the initial or final state, and
$\langle 0|\hat{\phi}_C|0\rangle = 0$.

FIGURE 6.1
The order $g^0$ term in the perturbative expansion: the two particles do not
interact.

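So we move on to the term of order $g^2$, which will provide the real meat
of this chapter. This term is

\[
\begin{aligned}
\frac{(-ig)^2}{2}\int\!\!\int d^4x_1\,d^4x_2\,&\langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B) \\
&\times T\{\hat{\phi}_A(x_1)\hat{\phi}_B(x_1)\hat{\phi}_C(x_1)\hat{\phi}_A(x_2)\hat{\phi}_B(x_2)\hat{\phi}_C(x_2)\} \\
&\times \hat{a}^\dagger_A(p_A)\hat{a}^\dagger_B(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}.
\end{aligned} \tag{6.74}
\]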


The vev here involves the product of ten operators, so it will pay us to pause 
and think how such things may be efficiently evaluated. 
Consider the case of just four operators 


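\[ \langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle \tag{6.75} \]

where each of $\hat{A}$, $\hat{B}$, $\hat{C}$, $\hat{D}$ is an $\hat{a}_i$, an $\hat{a}^\dagger_i$ or a linear combination of these. Let
$\hat{A}$ have the generic form $\hat{A} = \hat{a} + \hat{a}^\dagger$. Then (using $\langle 0|\hat{a}^\dagger = \hat{a}|0\rangle = 0$)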


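\[ \langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|\hat{a}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|[\hat{a}, \hat{B}\hat{C}\hat{D}]|0\rangle. \tag{6.76} \]
Now it is an algebraic identity that
\[ [\hat{a}, \hat{B}\hat{C}\hat{D}] = [\hat{a}, \hat{B}]\hat{C}\hat{D} + \hat{B}[\hat{a}, \hat{C}]\hat{D} + \hat{B}\hat{C}[\hat{a}, \hat{D}]. \tag{6.77} \]

Hence
\[ \langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = [\hat{a}, \hat{B}]\langle 0|\hat{C}\hat{D}|0\rangle + [\hat{a}, \hat{C}]\langle 0|\hat{B}\hat{D}|0\rangle + [\hat{a}, \hat{D}]\langle 0|\hat{B}\hat{C}|0\rangle, \tag{6.78} \]

remembering that all the commutators, if non-vanishing, are just ordinary
numbers (see (6.46)). We can rewrite (6.78) in more suggestive form by noting
that
\[ [\hat{a}, \hat{B}] = \langle 0|[\hat{a}, \hat{B}]|0\rangle = \langle 0|\hat{a}\hat{B}|0\rangle = \langle 0|\hat{A}\hat{B}|0\rangle. \tag{6.79} \]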


Thus the vev of a product of four operators is just the sum of the products 
of all the possible pairwise ‘contractions’ (the name given to the vev of the 
product of two fields): 


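\[ \langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|\hat{A}\hat{B}|0\rangle\langle 0|\hat{C}\hat{D}|0\rangle + \langle 0|\hat{A}\hat{C}|0\rangle\langle 0|\hat{B}\hat{D}|0\rangle + \langle 0|\hat{A}\hat{D}|0\rangle\langle 0|\hat{B}\hat{C}|0\rangle. \tag{6.80} \]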
This result generalizes to the vev of the product of any number of operators; 
there is also a similar result for the vev of time-ordered products of operators, 
which is known as Wick’s theorem (Wick 1950), and is indispensable for a 
general discussion of quantum field perturbation theory. 
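The pairing rule (6.80) can be checked numerically in the simplest setting. The sketch below is an illustration only: it assumes a single oscillator mode with $[\hat{a}, \hat{a}^\dagger] = 1$ (rather than the continuum normalization of (6.46)), represents each operator as a pair `(alpha, beta)` meaning `alpha*a + beta*a_dagger`, and works in a truncated Fock space. The helper names are hypothetical.

```python
import math

N_LEVELS = 8  # truncation; exact here, since four operators never reach beyond level 4

def apply(op, state):
    """Apply alpha*a + beta*a_dagger to a state stored as a list of Fock amplitudes."""
    alpha, beta = op
    out = [0.0] * len(state)
    for n, amp in enumerate(state):
        if amp == 0:
            continue
        if n >= 1:
            out[n - 1] += alpha * math.sqrt(n) * amp      # a|n> = sqrt(n)|n-1>
        if n + 1 < len(state):
            out[n + 1] += beta * math.sqrt(n + 1) * amp   # a_dagger|n> = sqrt(n+1)|n+1>
    return out

def vev(*ops):
    """Vacuum expectation value <0|op_1 op_2 ... op_k|0>."""
    state = [1.0] + [0.0] * (N_LEVELS - 1)
    for op in reversed(ops):  # the rightmost operator acts on the vacuum first
        state = apply(op, state)
    return state[0]

def sum_of_pairings(A, B, C, D):
    """Right-hand side of (6.80): the three possible pairwise contractions."""
    return (vev(A, B) * vev(C, D)
            + vev(A, C) * vev(B, D)
            + vev(A, D) * vev(B, C))
```

For any four coefficient pairs, `vev(A, B, C, D)` and `sum_of_pairings(A, B, C, D)` should agree to rounding error, which is the content of (6.80) for this one-mode toy case.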

Consider then the application of (6.80), as generalized to ten operators,
to the vev in (6.74). The only kind of non-vanishing contractions are of the
form $\langle 0|\hat{a}_i\hat{a}^\dagger_i|0\rangle$. Thus the contractions of A-, B- and C-type operators can be
considered separately. As far as the C-operators are concerned, then, we can
immediately conclude that the only surviving contraction is

\[ \langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle. \tag{6.81} \]


This quantity is, in fact, of fundamental importance: it is called the Feynman 
propagator (in coordinate space) for the spin-0 C-particle. We shall derive 
the mathematical formula for it in due course, but for the moment let us 
understand its physical significance. Each of the $\hat{\phi}_C$’s in (6.81) can create
or destroy C-quanta, but for the vev to be non-zero anything created in the
‘initial’ state must be destroyed in the ‘final’ one. Which of the times $t_1$ and
$t_2$ is initial or final is determined by the T-ordering symbol: for $t_1 > t_2$, a C-quantum
is created at $x_2$ and destroyed at $x_1$, while for $t_1 < t_2$ a C-quantum
is created at $x_1$ and destroyed at $x_2$. Thus the amplitude (6.81) may be
represented pictorially as in figure 6.2, where time increases to the right, and 
the vertical axis is a one-dimensional version of three-dimensional space. It 
seems reasonable, indeed, to call this object the ‘propagator’, since it clearly 
has to do with a quantum propagating between two space-time points. 

We might now worry that this explicit time-ordering seems to introduce a 
Lorentz non-invariant element into the calculation, ultimately threatening the 
Lorentz invariance of the $\hat{S}$-operator (6.42). The reason that this is in fact not
the case exposes an important property of quantum field theory. If the two
points $x_1$ and $x_2$ are separated by a time-like interval (i.e. $(x_1 - x_2)^2 > 0$),
then the time-ordering is Lorentz invariant; this is because no proper Lorentz
transformation can alter the time-ordering of time-like separated events (here,
the events are the creation/annihilation of particles/antiparticles at $x_1$ and
$x_2$). By ‘proper’ is meant a transformation that does not reverse the sense of
time; the behaviour of the theory under time-reversal is a different question 
altogether, discussed earlier in section 4.2.4. The fact that time-ordering is 
FIGURE 6.2
C-quantum propagating (a) for $t_1 > t_2$ (from $x_2$ to $x_1$) and (b) $t_1 < t_2$ (from
$x_1$ to $x_2$).


invariant for time-like separated events is what guarantees that we cannot 
influence our past, only our future. But what if the events are space-like
separated, $(x_1 - x_2)^2 < 0$? We know that the scalar fields $\hat{\phi}_i(x_1)$ and $\hat{\phi}_i(x_2)$
commute for equal times: remarkably, one can show (problem 5.6(b)) that
they also commute for $(x_1 - x_2)^2 < 0$; so in this sector of $x_1 - x_2$ space
the time-ordering symbol is irrelevant. Thus, contrary to appearances, the
T-product vev is Lorentz invariant. For the same reason, the $\hat{S}$ operator of
(6.42) is also Lorentz invariant: see, for example, Weinberg (1995, section 3.5).
The property

\[ [\hat{\phi}_i(x_1), \hat{\phi}_i(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{6.82} \]


has an important physical interpretation. In quantum mechanics, if operators 
representing physical observables commute with each other, then measure- 
ments of either observable can be performed without interfering with each 
other; the observables are said to be ‘compatible’. This is just what we would 
want for measurements done at two points which are space-like separated — 
no signal with speed less than or equal to light can connect them, and so we 
would expect them to be non-interfering. Condition (6.82) is often called a 
‘causality’ condition. 

More mathematically, the amplitude (6.81) is in fact a Green function for
the KG operator $(\Box + m_C^2)$ (see appendix G, and problem 6.3). That is to
say,

\[ (\Box_{x_1} + m_C^2)\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle = -i\delta^4(x_1 - x_2). \tag{6.83} \]
Actually, problem 6.3 shows that (6.83) is true even when the $\langle 0|$ and $|0\rangle$
are removed, i.e. the operator quantity $T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))$ is itself a KG Green
function. The work of appendices G and H indicates the central importance
of such Green functions in scattering theory, so we need not be surprised to
find such a thing appearing here.
Now let us figure out what are all the surviving terms in the vev in (6.74).
As far as contractions involving $\hat{a}_A(p'_A)$ are concerned, we have only three
non-zero possibilities:

\[ \langle 0|\hat{a}_A(p'_A)\hat{a}^\dagger_A(p_A)|0\rangle, \quad \langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle, \quad \langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_2)|0\rangle. \tag{6.84} \]


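There are similar possibilities for $\hat{a}^\dagger_A(p_A)$, $\hat{a}_B(p'_B)$ and $\hat{a}^\dagger_B(p_B)$. The upshot is
that we have only the following pairings to consider:

\[
\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}^\dagger_A(p_A)|0\rangle\,\langle 0|\hat{a}_B(p'_B)\hat{a}^\dagger_B(p_B)|0\rangle \\
&\quad\times \langle 0|T(\hat{\phi}_A(x_1)\hat{\phi}_A(x_2))|0\rangle\,\langle 0|T(\hat{\phi}_B(x_1)\hat{\phi}_B(x_2))|0\rangle\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle;
\end{aligned} \tag{6.85}
\]

\[
\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}^\dagger_A(p_A)|0\rangle\,\langle 0|\hat{a}_B(p'_B)\hat{\phi}_B(x_1)|0\rangle \\
&\quad\times \langle 0|\hat{\phi}_B(x_2)\hat{a}^\dagger_B(p_B)|0\rangle\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle\,\langle 0|T(\hat{\phi}_A(x_1)\hat{\phi}_A(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.86}
\]

\[
\begin{aligned}
&\langle 0|\hat{a}_B(p'_B)\hat{a}^\dagger_B(p_B)|0\rangle\,\langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle \\
&\quad\times \langle 0|\hat{\phi}_A(x_2)\hat{a}^\dagger_A(p_A)|0\rangle\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle\,\langle 0|T(\hat{\phi}_B(x_1)\hat{\phi}_B(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.87}
\]

\[
\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle\,\langle 0|\hat{\phi}_A(x_2)\hat{a}^\dagger_A(p_A)|0\rangle\,\langle 0|\hat{a}_B(p'_B)\hat{\phi}_B(x_1)|0\rangle \\
&\quad\times \langle 0|\hat{\phi}_B(x_2)\hat{a}^\dagger_B(p_B)|0\rangle\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.88}
\]

\[
\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle\,\langle 0|\hat{\phi}_A(x_2)\hat{a}^\dagger_A(p_A)|0\rangle\,\langle 0|\hat{a}_B(p'_B)\hat{\phi}_B(x_2)|0\rangle \\
&\quad\times \langle 0|\hat{\phi}_B(x_1)\hat{a}^\dagger_B(p_B)|0\rangle\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2.
\end{aligned} \tag{6.89}
\]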


We already know that quantities like $\langle 0|\hat{a}_A(p'_A)\hat{a}^\dagger_A(p_A)|0\rangle$ yield something
proportional to $\delta^3(\boldsymbol{p}_A - \boldsymbol{p}'_A)$ and correspond to the initial A-particle going
‘straight through’. The other factors in (6.85)-(6.89) which are new are quantities
like $\langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle$, which has the value (problem 6.4)

\[ \langle 0|\hat{a}_A(p'_A)\hat{\phi}_A(x_1)|0\rangle = \frac{1}{\sqrt{2E'_A}}\,e^{ip'_A\cdot x_1} \tag{6.90} \]

which is proportional (depending on the adopted normalization) to the wavefunction
for an outgoing A-particle with 4-momentum $p'_A$.

We are now in a position to give a diagrammatic interpretation of all 
of (6.85)—(6.89). In these diagrams, we shall not (as we did in figure 6.2) 
draw two separately time-ordered pieces for each propagator. We shall not 
indicate the time-ordering at all and we shall understand that both time- 
orderings are always included in each propagator line. Term (6.85) then has
the structure shown in figure 6.3(a); term (6.86) that shown in figure 6.3(b);
term (6.87) that in figure 6.3(c); term (6.88) that in figure 6.3(d); and term
(6.89) that in figure 6.3(e). We recognize in figure 6.3(e) the long-awaited
Yukawa exchange process, which we shall shortly analyse in full, but the
formalism has yielded much else besides! We shall come back to figures 6.3(a),
(b) and (c) in section 6.3.5; for the moment we note that these processes do
not represent true interactions between the particles, since at least one goes
through unscattered in each case. So we shall concentrate on figures 6.3(d)
and (e), and derive the Feynman rules for them.

FIGURE 6.3
Graphical representation of (6.85)-(6.89): (a) (6.85); (b) (6.86); (c) (6.87);
(d) (6.88); (e) (6.89).

First, consider figure 6.3(e), corresponding to the contraction (6.89). When
this is inserted into (6.74), the two terms in which $x_1$ and $x_2$ are interchanged
give identical results (interchanging $x_1$ and $x_2$ in the integral), so the
contribution we are discussing is

\[ (-ig)^2\int\!\!\int d^4x_1\,d^4x_2\,e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2}\,\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle. \tag{6.91} \]


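We must now turn our attention, as promised, to the propagator of (6.81),
$\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle$. Inserting the mode expansion (6.52) for each of $\hat{\phi}_C(x_1)$
and $\hat{\phi}_C(x_2)$, and using the commutation relations (6.46) and the vacuum
conditions (6.70) we find (problem 6.5)

\[
\begin{aligned}
\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle = \int\frac{d^3k}{(2\pi)^3 2\omega_k}\,\big[&\theta(t_1 - t_2)\,e^{-i\omega_k(t_1 - t_2) + i\boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)} \\
&+ \theta(t_2 - t_1)\,e^{-i\omega_k(t_2 - t_1) + i\boldsymbol{k}\cdot(\boldsymbol{x}_2 - \boldsymbol{x}_1)}\big]
\end{aligned} \tag{6.92}
\]

where $\omega_k = (\boldsymbol{k}^2 + m_C^2)^{1/2}$. This expression is very ‘uncovariant looking’,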
due to the presence of the $\theta$-functions with time arguments. But the earlier
discussion, after (6.81), has assured us that the left-hand side of (6.92)
must be Lorentz invariant, and, by a clever trick, it is possible to recast
the right-hand side in manifestly invariant form. We introduce an integral
representation of the $\theta$-function via

\[ \theta(t) = i\int_{-\infty}^{\infty}\frac{dz}{2\pi}\,\frac{e^{-izt}}{z + i\epsilon} \tag{6.93} \]


where $\epsilon$ is an infinitesimally small positive quantity (see appendix F). Multiplying
(6.93) by $e^{-i\omega_k t}$ and changing $z$ to $z + \omega_k$ in the integral we have

\[ \theta(t)\,e^{-i\omega_k t} = i\int_{-\infty}^{\infty}\frac{dz}{2\pi}\,\frac{e^{-izt}}{z - (\omega_k - i\epsilon)}. \tag{6.94} \]


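Putting (6.94) into (6.92) then yields

\[
\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle = i\int\frac{d^3k\,dz}{(2\pi)^4 2\omega_k}\left\{\frac{e^{-iz(t_1 - t_2) + i\boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)}}{z - (\omega_k - i\epsilon)} + \frac{e^{iz(t_1 - t_2) - i\boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)}}{z - (\omega_k - i\epsilon)}\right\}. \tag{6.95}
\]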
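The exponentials and the volume element demand a more symmetrical notation:
let us write $k_0 = z$ so that ($k_0 = z$, $\boldsymbol{k}$) form the components of a 4-vector
$k^\mu$.¹ Note very carefully, however, that $k_0$ is not $(\boldsymbol{k}^2 + m_C^2)^{1/2}$! The variable $k_0$
is unrestricted, whereas it is $\omega_k$ that equals $(\boldsymbol{k}^2 + m_C^2)^{1/2}$. With this change
of notation, (6.95) becomes

\[
\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle = \int\frac{d^4k}{(2\pi)^4}\,\frac{i}{2\omega_k}\left\{\frac{e^{-ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)} + \frac{e^{ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)}\right\}. \tag{6.96}
\]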
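Changing $k \to -k$ ($k_0 \to -k_0$, $\boldsymbol{k} \to -\boldsymbol{k}$) in the second term in (6.96), we
finally have

\[
\begin{aligned}
\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle
&= \int\frac{d^4k}{(2\pi)^4}\,\frac{i}{2\omega_k}\,e^{-ik\cdot(x_1 - x_2)}\left\{\frac{1}{k_0 - (\omega_k - i\epsilon)} - \frac{1}{k_0 + (\omega_k - i\epsilon)}\right\} \\
&= \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - (\omega_k - i\epsilon)^2}
\end{aligned} \tag{6.97}
\]

or

\[ \langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle = \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - \boldsymbol{k}^2 - m_C^2 + i\epsilon} \tag{6.98} \]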


¹We know that the left-hand side of (6.95) is Lorentz invariant, and that ($t_1 - t_2$, $\boldsymbol{x}_1 - \boldsymbol{x}_2$)
form the components of a 4-vector. The quantities ($k_0 = z$, $\boldsymbol{k}$) must also form the
components of a 4-vector, in order for the exponentials in (6.95) to be invariant.




where in the last step we have used $\omega_k^2 = \boldsymbol{k}^2 + m_C^2$ and written ‘$i\epsilon$’ for ‘$2i\epsilon\omega_k$’
since what matters is just the sign of the small imaginary part (note that $\omega_k$ is
defined as the positive square root). In this final form, the Lorentz invariance
of the scalar propagator is indeed manifest.
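The algebra that combines the two poles of (6.97) into the single covariant denominator of (6.98) can be spot-checked with ordinary complex arithmetic. The sketch below is only an illustration with hypothetical function names; it keeps $\epsilon$ small but finite and uses the identification of ‘$i\epsilon$’ with ‘$2i\epsilon\omega_k$’ made in the text, so the two forms agree up to terms of order $\epsilon$.

```python
import math

def two_pole_form(k0, k_vec_sq, mass, eps):
    """Integrand factor of (6.97): (i/2w) [1/(k0 - (w - i eps)) - 1/(k0 + (w - i eps))]."""
    omega = math.sqrt(k_vec_sq + mass**2)
    pole = omega - 1j * eps
    return (1j / (2.0 * omega)) * (1.0 / (k0 - pole) - 1.0 / (k0 + pole))

def covariant_form(k0, k_vec_sq, mass, eps):
    """Integrand factor of (6.98), with the small imaginary part written as 2*eps*omega."""
    omega = math.sqrt(k_vec_sq + mass**2)
    return 1j / (k0**2 - k_vec_sq - mass**2 + 2j * eps * omega)
```

For an off-shell point such as `k0 = 0.3`, `k_vec_sq = 1.0`, `mass = 1.0`, the two functions differ only at the level of `eps` itself, which is the sense in which the final form is equivalent as $\epsilon \to 0$.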

We shall have more to say about this propagator (Green function) in section 6.3.3.
For the moment we simply note two points: first, it is the Fourier
transform of $i/(k^2 - m_C^2 + i\epsilon)$, as stated in appendix G, where $k^2 = k_0^2 - \boldsymbol{k}^2$;
and second, it is a function of the coordinate difference $x_1 - x_2$, as it has to
be since we do not expect physics to depend on the choice of origin. This
second point gives us a clue as to how best to perform the $x_1$-$x_2$ integral
in (6.91). Let us introduce the new variables $x = x_1 - x_2$, $X = (x_1 + x_2)/2$.
Then (problem 6.6) (6.91) reduces to

\[ (-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\int d^4x\,e^{iq\cdot x}\int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot x}\,\frac{i}{k^2 - m_C^2 + i\epsilon} \tag{6.99} \]
\[ = (-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon} \tag{6.100} \]


where $q = p_A - p'_B = p'_A - p_B$ is the 4-momentum transfer carried by the
exchanged C-quantum in figure 6.4, and we have used the four-dimensional 
version of (E.26). We associate this single expression, which includes the 
two coordinate space processes of figure 6.2, with the single momentum—space 
Feynman diagram of figure 6.4. The arrows refer merely to the flow of 4- 
momentum, which is conserved at each ‘vertex’ (i.e. meeting of three lines). 
Thus although the arrow on the exchanged C-line is drawn as indicated, this 
has nothing to do with any presumed order of emission/absorption of the 
exchanged quantum. It cannot do so, after all, since in this diagram the states 
all have definite 4-momentum and hence are totally delocalized in space-time; 
equivalently, we recall from (6.91) that the amplitude in fact involves integrals 
over all space-time. 

A similar analysis (problem 6.7) shows that the contribution of the contractions
(6.88) to the S-matrix element (6.74) is

\[ (-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{(p_A + p_B)^2 - m_C^2 + i\epsilon} \tag{6.101} \]
which is represented by the momentum-space Feynman diagram of figure 6.5. 
At this point we may start to write down the Feynman rules for the ABC 
theory, which enable us to associate a precise mathematical expression for an 
amplitude with a Feynman diagram such as figure 6.4 or figure 6.5. It is clear 
that we will always have a factor $(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)$ for all ‘connected’
diagrams, following from the flow of the conserved 4-momentum through the
diagrams. It is conventional to extract this factor, and to define the invariant
amplitude $\mathcal{M}_{fi}$ via
\[ S_{fi} = \delta_{fi} + i(2\pi)^4\delta^4(p_f - p_i)\,\mathcal{M}_{fi} \tag{6.102} \]
FIGURE 6.4
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of
(6.100).

FIGURE 6.5
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of
(6.101).
in general (cf (6.57)). The rules reconstruct the invariant amplitude $i\mathcal{M}_{fi}$
corresponding to a given diagram, and for the present case they are:


(i) At each vertex, a factor $-ig$.


(ii) For each internal line, a factor

\[ \frac{i}{q_i^2 - m_i^2 + i\epsilon} \tag{6.103} \]

where $i$ = A, B or C and $q_i$ is the 4-momentum carried by that line.
The factor (6.103) is the Feynman propagator in momentum space,
for the scalar particle ‘$i$’.


Of course, it is no big deal to give a set of rules which will just reconstruct 
(6.100) and (6.101). The real power of the ‘rules’ is that they work for all 
diagrams we can draw by joining together vertices and propagators (except 
that we have not yet explained what to do if more than one particle appears 
‘internally’ between two vertices, as in figures 6.3(a)-(c): see section 6.3.5). 
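Rules (i) and (ii) are simple enough to apply mechanically. The following sketch, with hypothetical function names and 4-vectors stored as tuples `(E, px, py, pz)`, assembles $i\mathcal{M}_{fi}$ for the two tree diagrams of figures 6.4 and 6.5: two vertex factors $-ig$ multiplied by one propagator (6.103). It is an illustration under these assumptions, not a general diagram evaluator.

```python
def minkowski_dot(p, q):
    """Lorentz-invariant product with metric (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def scalar_propagator(q, mass, eps=0.0):
    """Rule (ii): i / (q^2 - m^2 + i eps) for an internal scalar line."""
    return 1j / (minkowski_dot(q, q) - mass**2 + 1j * eps)

def i_amplitude_exchange(p_a, p_b_out, g, m_c, eps=0.0):
    """i*M for figure 6.4: the C line carries q = p_A - p'_B, as in (6.100)."""
    return (-1j * g) ** 2 * scalar_propagator(sub(p_a, p_b_out), m_c, eps)

def i_amplitude_s_channel(p_a, p_b, g, m_c, eps=0.0):
    """i*M for figure 6.5: the C line carries p_A + p_B, as in (6.101)."""
    return (-1j * g) ** 2 * scalar_propagator(add(p_a, p_b), m_c, eps)
```

The overall factor $(2\pi)^4\delta^4(\ldots)$ is deliberately left out, in line with the definition (6.102).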


6.3.3 A + B → A + B scattering: the Yukawa exchange
mechanism, s and u channel processes


Referring back to section 1.3.3, equation (1.28), we see that the amplitude for
the exchange process of figure 6.4 indeed has the form suggested there, namely
$\sim g^2/(q^2 - m_C^2)$ if C is exchanged. We have seen how, in the static limit, this
may be interpreted as a Yukawa interaction of range $\hbar/m_C c$ between the particles
A and B, treated in the Born approximation. Expression (6.100), then,
provides us with the correct relativistic formula for this Yukawa mechanism.

There is more to be said about this fundamental amplitude (6.100), which 
is essentially the C propagator in momentum space. While it is always true 
that $p_i^2 = m_i^2$ for a free particle of 4-momentum $p_i$ and rest mass $m_i$, it is
not the case that $q^2 = m_C^2$ in (6.100). We emphasized after (6.95) that the
variable $k_0$ introduced there was not equal to $(\boldsymbol{k}^2 + m_C^2)^{1/2}$ and the result
of the step (6.99) to (6.100) was to replace $k_0$ by $q_0$ and $\boldsymbol{k}$ by $\boldsymbol{q}$, so that
$q_0 \neq (\boldsymbol{q}^2 + m_C^2)^{1/2}$, i.e. $q^2 = q_0^2 - \boldsymbol{q}^2 \neq m_C^2$. So the exchanged quantum in
figure 6.4 does not satisfy the ‘mass-shell condition’ $p^2 = m^2$; it is said to be
‘off-mass shell’ or ‘virtual’ (see also problem 6.8). It is quite a different entity 
from a free quantum. Indeed, as we saw in more elementary physical terms 
in section 1.3.2, it has a fleeting existence, as sanctioned by the uncertainty 
relation. 

It is convenient, at this point, to introduce some kinematic variables which 
will appear often in following chapters. These are the ‘Mandelstam variables’ 
(Mandelstam 1958, 1959) 


\[ s = (p_A + p_B)^2 \qquad t = (p_A - p'_A)^2 \qquad u = (p_A - p'_B)^2. \tag{6.104} \]


They are clearly relativistically invariant. In terms of these variables the
amplitude (6.100) is essentially $\sim 1/(u - m_C^2 + i\epsilon)$, and the amplitude (6.101)
is $\sim 1/(s - m_C^2 + i\epsilon)$. The first is said to be a ‘u-channel process’, the second
an ‘s-channel process’. Amplitudes of the form $(t - m^2)^{-1}$ or $(u - m^2)^{-1}$
are basically one-quantum exchange (i.e. ‘force’) processes, while those of the
form $(s - m_C^2)^{-1}$ have a rather different interpretation, as we now discuss.

FIGURE 6.6
$O(e^2)$ contribution to $e^+e^- \to e^+e^-$ via annihilation to (and re-emission from)
a virtual $\gamma$ state.
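To make the variables (6.104) concrete, the sketch below builds CM-frame momenta for elastic A + B → A + B scattering and evaluates $s$, $t$ and $u$. The helper names and the choice of scattering plane are our own assumptions; the sum rule $s + t + u = 2m_A^2 + 2m_B^2$ quoted in the comment is the standard identity for this process and is not derived in the text.

```python
import math

def minkowski_sq(p):
    """p^2 with metric (+, -, -, -); p is a tuple (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cm_momenta(m_a, m_b, p, theta):
    """Elastic CM kinematics: A comes in along +z and leaves at angle theta in the x-z plane."""
    e_a = math.sqrt(m_a**2 + p**2)
    e_b = math.sqrt(m_b**2 + p**2)
    p_a = (e_a, 0.0, 0.0, p)
    p_b = (e_b, 0.0, 0.0, -p)
    p_a_out = (e_a, p * math.sin(theta), 0.0, p * math.cos(theta))
    p_b_out = (e_b, -p * math.sin(theta), 0.0, -p * math.cos(theta))
    return p_a, p_b, p_a_out, p_b_out

def mandelstam(p_a, p_b, p_a_out, p_b_out):
    """Return (s, t, u) as defined in (6.104)."""
    s = minkowski_sq(add(p_a, p_b))
    t = minkowski_sq(sub(p_a, p_a_out))
    u = minkowski_sq(sub(p_a, p_b_out))
    return s, t, u  # expected to satisfy s + t + u = 2 m_a^2 + 2 m_b^2
```

For equal masses this gives $u = -2\boldsymbol{p}^2(1 + \cos\theta)$, which is never positive; this is the situation examined in problem 6.8.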

Let us first ask: can $s = (p_A + p_B)^2$ ever equal $m_C^2$ in (6.101)? Since $s$ is
invariant, we can evaluate it in any frame we like, for example the centre-of-momentum
(CM) frame in which

\[ (p_A + p_B)^2 = (E_A + E_B)^2 \tag{6.105} \]

with $E_A = (m_A^2 + \boldsymbol{p}^2)^{1/2}$, $E_B = (m_B^2 + \boldsymbol{p}^2)^{1/2}$. It is then clear that if $m_C <
m_A + m_B$ the condition $(p_A + p_B)^2 = m_C^2$ can never be satisfied, and the internal
quantum in figure 6.5 is always virtual (note that $p_A + p_B$ is the 4-momentum
of the C-quantum). Depending on the details of the theory with which we
are dealing, such an s-channel process can have different interpretations. In
QED, for example, in the process $e^+ + e^- \to e^+ + e^-$ we could have a virtual $\gamma$
s-channel process as shown in figure 6.6. This would be called an ‘annihilation
process’ for obvious reasons. In the process $\gamma + e^- \to \gamma + e^-$, however, we could
have figure 6.7, which would be interpreted as an absorption and re-emission
process (i.e. of a photon).

However, if $m_C > m_A + m_B$, then we can indeed satisfy $(p_A + p_B)^2 = m_C^2$,
and so (remembering that $\epsilon$ is infinitesimal) we seem to have an infinite result
when $s$ (the square of the CM energy) hits the value $m_C^2$. In fact, this is not the
case. If $m_C > m_A + m_B$, the C-particle is unstable against decay to A + B, as
we saw in section 6.3.1. The s-channel process must then be interpreted as the
formation of a resonance, i.e. of the transitory and decaying state consisting
of the single C-particle. Such a process would be described non-relativistically
by a Breit-Wigner amplitude of the form

\[ \mathcal{M} \propto 1/(E - E_R + i\Gamma/2) \tag{6.106} \]

which produces a peak in $|\mathcal{M}|^2$ centred at $E = E_R$ and full width $\Gamma$ at half-height;
$\Gamma$ is, in fact, precisely the width calculated in section 6.3.1. The
relativistic generalization of (6.106) is

\[ \mathcal{M} \propto \frac{1}{s - M^2 + iM\Gamma} \tag{6.107} \]

where $M$ is the mass of the unstable particle. Thus in the present case the
prescription for avoiding the infinity in our amplitude is to replace the
infinitesimal ‘$i\epsilon$’ in (6.101) by the finite quantity $im_C\Gamma$, with $\Gamma$ as calculated
in section 6.3.1. We shall see examples of such s-channel resonances in
section 9.5.

FIGURE 6.7
$O(e^2)$ contribution to $\gamma e^- \to \gamma e^-$ via absorption to (and re-emission from) a
virtual $e^-$ state.
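The statement that (6.106) gives a peak of full width $\Gamma$ at half-height is easy to confirm numerically. In the sketch below the proportionality constant is set to one and the function names are hypothetical; only the shapes of $|\mathcal{M}|^2$ from (6.106) and (6.107) are examined.

```python
def breit_wigner_sq(energy, e_r, gamma):
    """|M|^2 for the non-relativistic form (6.106), with unit normalization assumed."""
    amplitude = 1.0 / (energy - e_r + 0.5j * gamma)
    return abs(amplitude) ** 2

def relativistic_breit_wigner_sq(s, mass, gamma):
    """|M|^2 for the relativistic form (6.107), with unit normalization assumed."""
    amplitude = 1.0 / (s - mass**2 + 1j * mass * gamma)
    return abs(amplitude) ** 2
```

At `energy = e_r` the first function takes its maximum value `4 / gamma**2`, and at `energy = e_r + gamma / 2` or `e_r - gamma / 2` it has dropped to exactly half of that. The second function peaks at `s = mass**2`, where it equals `1 / (mass * gamma)**2` and so stays finite for any non-zero width.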


6.3.4 A + B → A + B scattering: the differential
cross section


We complete this exercise in the ‘ABC’ theory by showing how to calculate the
cross section for A + B → A + B scattering in terms of the invariant amplitude
$\mathcal{M}_{fi}$ of (6.102). The discussion will closely parallel the calculation of the decay
rate $\Gamma$ in section 6.3.1.

As in (6.56), the transition rate per unit volume, in this case, is

\[ \dot{P}_{fi} = (2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,|\mathcal{M}_{fi}|^2. \tag{6.108} \]


In order to obtain a quantity which may be compared from experiment to 
experiment, we must remove the dependence of the transition rate on the 
incident flux of particles and on the number of target particles per unit volume. 
Now the flux of beam particles (‘A’ ones, let us say) incident on a stationary 
target is just the number of particles per unit area reaching the target in unit 
time which, with our normalization of ‘2E particles per unit volume’, is just 


\[ |\boldsymbol{v}|\,2E_A \tag{6.109} \]

where $\boldsymbol{v}$ is the velocity of the incident A in the rest frame of the target B.
The number of target particles per unit volume is $2E_B$ ($= 2m_B$ for B at rest,
of course).

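We must also include the ‘density of final states’ factors, as in (6.59).
Putting all this together, the total cross section $\sigma$ is given in terms of the
differential cross section $d\sigma$ by

\[
\begin{aligned}
\sigma = \int d\sigma &= \frac{1}{2E_B\,2E_A|\boldsymbol{v}|}\int|\mathcal{M}_{fi}|^2\,(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{d^3p'_A}{(2\pi)^3 2E'_A}\,\frac{d^3p'_B}{(2\pi)^3 2E'_B} \\
&= \frac{1}{4E_AE_B|\boldsymbol{v}|}\int|\mathcal{M}_{fi}|^2\,d\mathrm{Lips}(s; p'_A, p'_B),
\end{aligned} \tag{6.110}
\]

where we have introduced the Lorentz invariant phase space $d\mathrm{Lips}(s; p'_A, p'_B)$
defined by

\[ d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{d^3p'_A}{E'_A}\,\frac{d^3p'_B}{E'_B}. \tag{6.111} \]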

We can write the flux factor for collinear collisions in invariant form using the 
relation (easily verified in a particular frame (problem 6.9)) 


\[ E_AE_B|\boldsymbol{v}| = [(p_A\cdot p_B)^2 - m_A^2m_B^2]^{1/2}. \tag{6.112} \]


Everything in (6.110) is now written in invariant form. 
It is a useful exercise to evaluate $\int d\sigma$ in a given frame, and the simplest
one is the centre-of-momentum (CM) frame defined by

\[ \boldsymbol{p}_A + \boldsymbol{p}_B = \boldsymbol{p}'_A + \boldsymbol{p}'_B = \boldsymbol{0}. \tag{6.113} \]


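However, before specializing to this frame, it is convenient to simplify our
expression for dLips. Using the 3-momentum part of the $\delta$-function in (6.110),
we can eliminate the integral over $d^3p'_B$:

\[ \int\frac{d^3p'_B}{E'_B}\,\delta^4(p_A + p_B - p'_A - p'_B) = \frac{1}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B), \tag{6.114} \]

remembering also that now $\boldsymbol{p}'_B$ has to be replaced by $\boldsymbol{p}_A + \boldsymbol{p}_B - \boldsymbol{p}'_A$ in $\mathcal{M}_{fi}$. On
the right-hand side of (6.114), $\boldsymbol{p}'_B$ and $E'_B$ are no longer independent variables
but are determined by the conditions

\[ \boldsymbol{p}'_B = \boldsymbol{p}_A + \boldsymbol{p}_B - \boldsymbol{p}'_A \qquad E'_B = (m_B^2 + \boldsymbol{p}'^2_B)^{1/2}. \tag{6.115} \]
Next, convert $d^3p'_A$ to angular variables

\[ d^3p'_A = \boldsymbol{p}'^2_A\,d|\boldsymbol{p}'_A|\,d\Omega. \tag{6.116} \]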
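The energy $E'_A$ is given by
\[ E'_A = (m_A^2 + \boldsymbol{p}'^2_A)^{1/2} \tag{6.117} \]

so that
\[ E'_A\,dE'_A = |\boldsymbol{p}'_A|\,d|\boldsymbol{p}'_A|. \tag{6.118} \]

With all these changes we arrive at the result (valid in any frame)

\[ d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\boldsymbol{p}'_A|\,dE'_A}{E'_B}\,d\Omega\,\delta(E_A + E_B - E'_A - E'_B). \tag{6.119} \]
We now specialize to the CM frame for which $\boldsymbol{p}_A = \boldsymbol{p} = -\boldsymbol{p}_B$, $\boldsymbol{p}'_A = \boldsymbol{p}' =
-\boldsymbol{p}'_B$, and
\[ E'_A = (m_A^2 + \boldsymbol{p}'^2)^{1/2} \qquad E'_B = (m_B^2 + \boldsymbol{p}'^2)^{1/2} \tag{6.120} \]
so that
\[ E'_A\,dE'_A = |\boldsymbol{p}'|\,d|\boldsymbol{p}'| = E'_B\,dE'_B. \tag{6.121} \]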
Introduce the variable $W' = E'_A + E'_B$ (note that $W'$ is only constrained
to equal the total energy $W = E_A + E_B$ after the integral over the
energy-conserving $\delta$-function has been performed). Then (as in (6.62))

\[ dW' = dE'_A + dE'_B = \frac{W'|\boldsymbol{p}'|\,d|\boldsymbol{p}'|}{E'_AE'_B} = \frac{W'}{E'_B}\,dE'_A \tag{6.122} \]

where we have used (6.121) in each of the last two steps. Thus the factor

\[ |\boldsymbol{p}'|\,\frac{dE'_A}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B) \tag{6.123} \]
becomes
\[ |\boldsymbol{p}'|\,\frac{dW'}{W'}\,\delta(W - W') \tag{6.124} \]
which reduces to
\[ |\boldsymbol{p}|/W \]

after integrating over $W'$, since the energy-conservation relation forces $|\boldsymbol{p}'| =
|\boldsymbol{p}|$. We arrive at the important result


\[ d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\boldsymbol{p}|}{W}\,d\Omega \tag{6.125} \]
for the two-body phase space in the CM frame.

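The last piece in the puzzle is the evaluation of the flux factor (6.112) in
the CM frame. In the CM we have

\[ p_A\cdot p_B = (E_A, \boldsymbol{p})\cdot(E_B, -\boldsymbol{p}) \tag{6.126} \]
\[ = E_AE_B + \boldsymbol{p}^2 \tag{6.127} \]
and a straightforward calculation shows that
\[ (p_A\cdot p_B)^2 - m_A^2m_B^2 = \boldsymbol{p}^2W^2. \]

Hence we finally have

\[ \sigma = \int d\sigma = \frac{1}{4|\boldsymbol{p}|W}\,\frac{1}{(4\pi)^2}\,\frac{|\boldsymbol{p}|}{W}\int|\mathcal{M}_{fi}|^2\,d\Omega \tag{6.128} \]
and the CM differential cross section is
\[ \left.\frac{d\sigma}{d\Omega}\right|_{\mathrm{CM}} = \frac{1}{(8\pi W)^2}\,|\mathcal{M}_{fi}|^2. \tag{6.129} \]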
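The two ingredients of this result, the invariant flux factor and the CM formula (6.129), can be checked or used numerically as follows. This is a minimal sketch in natural units with hypothetical function names; converting a cross section to conventional units such as barns would need an extra factor that is not included here.

```python
import math

def flux_factor(m_a, m_b, p):
    """[(p_A . p_B)^2 - m_A^2 m_B^2]^(1/2) of (6.112), evaluated in the CM frame via (6.127)."""
    e_a = math.sqrt(m_a**2 + p**2)
    e_b = math.sqrt(m_b**2 + p**2)
    pa_dot_pb = e_a * e_b + p**2
    return math.sqrt(pa_dot_pb**2 - m_a**2 * m_b**2)

def cm_total_energy(m_a, m_b, p):
    """W = E_A + E_B in the CM frame."""
    return math.sqrt(m_a**2 + p**2) + math.sqrt(m_b**2 + p**2)

def dsigma_domega_cm(amplitude_sq, w):
    """CM differential cross section (6.129): |M|^2 / (8 pi W)^2."""
    return amplitude_sq / (8.0 * math.pi * w) ** 2

def sigma_isotropic(amplitude_sq, w):
    """Total cross section when |M|^2 is angle independent: 4 pi times (6.129)."""
    return 4.0 * math.pi * dsigma_domega_cm(amplitude_sq, w)
```

Comparing `flux_factor(m_a, m_b, p)` with `p * cm_total_energy(m_a, m_b, p)` for a few values reproduces the identity $(p_A\cdot p_B)^2 - m_A^2m_B^2 = \boldsymbol{p}^2W^2$ used above.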


6.3.5 A + B → A + B scattering: loose ends


We must now return to the amplitudes represented by figures 6.3(a)—(c), 
which we set aside earlier. Consider first figure 6.3(b). Here the A-particle has 
continued through without interacting, while the B-particle has made a virtual 
transition to the ‘A + C’ state, and then this state has reverted to the original 
B-state. So this is in the nature of a correction to the ‘no-scattering’ piece 
shown in figure 6.1, and does not contribute to $\mathcal{M}_{fi}$. However, such a virtual
transition B → A + C → B does represent a modification of the properties of
the original single B state, due to its interactions with other fields as specified
in the interaction Hamiltonian. We can easily imagine how, at order $g^4$, an amplitude will occur in
which such a virtual process is inserted into the C propagator in figure 6.4 so 
as to arrive at figure 6.8, from which it is plausible that such emission and 
reabsorption processes by the same particle effectively modify the propagator 
for this particle. This, in turn, suggests that part, at least, of their effect will 
be to modify the mass of the affected particle, so as to change it from the 
original value specified in the Lagrangian. We may think of this physically 
as being associated, in some way, with a particle’s carrying with it a ‘cloud’ 
of virtual particles, with which it is continually interacting; this will affect its 
mass, much as the mass of an electron in a solid becomes an ‘effective’ mass 
due to the various interactions experienced by the electron inside the solid. 

We shall postpone the evaluation of amplitudes such as those represented 
by figures 6.3(b) and (c) to chapter 10. However, we note here just one feature: 
4-momentum conservation applied at each vertex in figure 6.3(b) does not 
determine the individual 4-momenta of the intermediate A and C particles, 
only the sum of their 4-momenta, which is equal to $p_B$ (and this is equal to $p'_B$
also, so indeed no scattering has occurred). It is plausible that, if an internal 
4-momentum in a diagram is undetermined in terms of the external (fixed) 4- 
momenta of the physical process, then that undetermined 4-momentum should 
be integrated over. This is the case, as can be verified straightforwardly by 
evaluating the amplitude (6.86), for example, as we evaluated (6.89); a similar 
calculation will be gone through in detail in chapter 10, section 10.1.1. The 
corresponding Feynman rule is 
FIGURE 6.8
$O(g^4)$ contribution to the process A + B → A + B, in which a virtual transition
C → A + B → C occurs in the C propagator.


(iii) For each internal 4-momentum $k$ which is not fixed by 4-momentum
conservation, carry out the integration $\int d^4k/(2\pi)^4$. One such
integration with respect to an internal 4-momentum occurs for each
closed loop.


If we apply this new rule to figure 6.3(b), we find that we need to evaluate 
the integral 


\[ \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - m_A^2)\,((p_B - k)^2 - m_C^2)} \tag{6.130} \]

which, by simple counting of powers of k in numerator and denominator, is 
logarithmically divergent. Thus we learn that, almost before we have started 
quantum field theory in earnest, we seem to have run into a serious problem, 
which is going to affect all higher-order processes containing loops. The pro- 
cedure whereby these infinities are tamed is called renormalization, and we 
shall return to it in chapter 10. 
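The ‘simple counting of powers’ can be written out explicitly. The sketch below makes two assumptions that go beyond the text: it counts $d^4k$ as four powers of momentum per loop and each scalar propagator as minus two, and it illustrates the logarithmic growth with a Euclidean toy integral, $\int d^4k/(k^2 + m^2)^2$ over $|k| < \Lambda$, with equal masses and zero external momentum, rather than with (6.130) itself. The function names are hypothetical.

```python
import math

def superficial_degree(loops, internal_scalar_lines, dim=4):
    """Net power of loop momentum: dim per loop, minus 2 per scalar propagator.

    A result of 0 signals a logarithmic divergence by naive power counting.
    """
    return dim * loops - 2 * internal_scalar_lines

def euclidean_bubble_with_cutoff(cutoff, mass):
    """Toy integral: d^4k / (k^2 + m^2)^2 over |k| < cutoff, in closed form.

    Assumption: Euclidean signature, equal masses, zero external momentum.
    The angular integration in four dimensions contributes 2 * pi^2.
    """
    x = (cutoff / mass) ** 2
    return math.pi**2 * (math.log(1.0 + x) - x / (1.0 + x))
```

For the loop of figure 6.3(b), `superficial_degree(1, 2)` is 0, and in the toy integral each factor of ten in `cutoff` adds roughly the same amount, $2\pi^2\ln 10$, once `cutoff` is much larger than `mass`.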

Finally, what about figure 6.3(a)? In this case nothing at all has occurred 
to either of the scattering particles, and instead a virtual trio of A + B + C has 
appeared from the vacuum, and then disappeared back again. Such processes 
are called, obviously enough, vacuum diagrams. This particular one is in 
fact only (another) correction to figure 6.1, and it makes no contribution to 
$\mathcal{M}_{fi}$. But as with figure 6.8, at $O(g^4)$ we can imagine such a vacuum process
appearing ‘alongside’ figure 6.4 or figure 6.5, as in figures 6.9(a) and (b).
These are called ‘disconnected diagrams’ and, since in them A and B have
certainly interacted, they will contribute to $\mathcal{M}_{fi}$ (note that they are in this
respect quite different from the ‘straight through’ diagrams of figures 6.3(b)
and (c)). However, it turns out, rather remarkably, that their effect is exactly
compensated by another effect we have glossed over, namely the fact that the
vacuum $|0\rangle$ we have used in our S-matrix elements is plainly the unperturbed
vacuum (or ground state), whereas surely the introduction of interactions will
perturb it. A careful analysis of this (Peskin and Schroeder 1995, section 7.2)
shows that $\mathcal{M}_{fi}$ is to be calculated from only the connected Feynman diagrams.

FIGURE 6.9
$O(g^4)$ disconnected diagrams, (a) and (b), in A + B → A + B.

In this chapter we have seen how the Feynman rules for scattering and 
decay amplitudes in a simple scalar theory are derived, and also how cross 
sections and decay rates are calculated. A Yukawa (u-channel) exchange 
process has been found, in its covariant form, and the analogous s-channel 
process, together with a hint of the complications which arise when loops are 
considered, at higher order in g. Unfortunately, however, none of this applies 
directly to any real physical process, since we do not know of any physical 
‘scalar ABC’ interaction. Rather, the interactions in the Standard Model are 
all gauge interactions similar to electrodynamics (with the exception of the 
Higgs sector, which has both cubic and quartic scalar interactions). The me- 
diating quanta of these gauge interactions have spin-1, not zero; furthermore, 
the matter fields (again apart from the Higgs field) have spin-1/2. It is time to
begin discussing the complications of spin and the particular form of dynamics 
associated with the ‘gauge principle’. 


Problems 


6.1 Show that, for a quantum field f (t) (suppressing the space coordinates), 
where
\[
T(\hat{f}(t_1)\hat{f}(t_2)) =
\begin{cases}
\hat{f}(t_1)\hat{f}(t_2) & \text{for } t_1 > t_2 \\
\hat{f}(t_2)\hat{f}(t_1) & \text{for } t_2 > t_1.
\end{cases}
\]
6.2 Verify equation (6.65). 
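6.3 Let $\hat{\phi}(x, t)$ be a real scalar KG field in one space dimension, satisfying

\[ (\Box_x + m^2)\,\hat{\phi}(x, t) \equiv \left(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} + m^2\right)\hat{\phi}(x, t) = 0. \]

(a) Explain why

\[ T(\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2)) = \theta(t_1 - t_2)\,\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2) + \theta(t_2 - t_1)\,\hat{\phi}(x_2, t_2)\hat{\phi}(x_1, t_1) \]

(see equation (E.47) for a definition of the $\theta$-function).

(b) Using equation (E.46), show that

\[ \frac{d}{dx}\,\theta(x - a) = \delta(x - a). \]

(c) Using the result of (b) with appropriate changes of variable, and
equation (5.118), show that

\[ \frac{\partial}{\partial t_1}\{T(\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2))\} = \theta(t_1 - t_2)\,\dot{\hat{\phi}}(x_1, t_1)\hat{\phi}(x_2, t_2) + \theta(t_2 - t_1)\,\hat{\phi}(x_2, t_2)\dot{\hat{\phi}}(x_1, t_1). \]

(d) Using (5.117) and (5.122) show that

\[ \frac{\partial^2}{\partial t_1^2}\{T(\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2))\} = -i\delta(x_1 - x_2)\delta(t_1 - t_2) + T(\ddot{\hat{\phi}}(x_1, t_1)\hat{\phi}(x_2, t_2)) \]

and hence show that

\[ \left(\frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2\right)T(\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2)) = -i\delta(x_1 - x_2)\delta(t_1 - t_2). \]

This shows that $T(\hat{\phi}(x_1, t_1)\hat{\phi}(x_2, t_2))$ is a Green function (see appendix G,
equation (G.25); the $i$ is included here conventionally) for the KG operator

\[ \frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2. \]

The four-dimensional generalization is immediate.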


6.4 Verify (6.90).
6.5 Verify (6.92).
6.6 Verify (6.99) and (6.100). 


6.7 Show that the contribution of the contractions (6.88) to the S-matrix 
element (6.74) is given by (6.101). 


6.8 Consider the case of equal masses $m_A = m_B = m_C$. Evaluate $u$ of (6.104)
in the CM frame (compare section 1.3.6), and show that $u \le 0$, so that $u$
can never equal $m_C^2$ in (6.100). (This result is generally true for such single
particle ‘exchange’ processes.)


6.9 Verify (6.112).


7 Quantum Field Theory III: Complex Scalar Fields, Dirac and Maxwell Fields;
Introduction of Electromagnetic Interactions


In the previous two chapters we have introduced the formalism of relativistic 
quantum field theory for the case of free real scalar fields obeying the Klein— 
Gordon (KG) equation of section 3.1, extended it to describe interactions 
between such quantum fields and shown how the Feynman rules for a simple 
Yukawa-like theory are derived. It is now time to return to the unfortunately 
rather more complicated real world of quarks and leptons interacting via gauge 
fields — in particular electromagnetism. For this, several generalizations of the 
formalism of chapter 5 are necessary. 

First, a glance back at chapter 2 will remind the reader that the electro- 
magnetic interaction has everything to do with the phase of wavefunctions, 
and hence presumably of their quantum field generalizations: fields which are 
real must be electromagnetically neutral. Indeed, as noted very briefly in 
section 5.3, the quanta of a real scalar field are their own antiparticles; for 
a given mass, there is only one type of particle being created or destroyed. 
However, physical particles and antiparticles have identical masses (e.g. $e^-$ and
$e^+$), and it is actually a deep result of quantum field theory that this is so (see
section 4.2.5, and the end of section 7.1). In this case for a given mass m, there 
will have to be two distinct field degrees of freedom, one of which corresponds 
somehow to the ‘particle’, the other to the ‘antiparticle’. This suggests that we 
will need a complex field if we want to distinguish particle from antiparticle, 
even in the absence of electromagnetism (for example, the $(K^0, \bar{K}^0)$ pair). Such
a distinction will have to be made in terms of some conserved quantum number 
(or numbers), having opposite values for ‘particle’ and ‘antiparticle’. This 
conserved quantum number must be associated with some symmetry. Now, 
referring again to chapter 2, we recall that electromagnetism is associated with 
invariance under local U(1) phase transformations. Even in the absence of 
electromagnetism, however, a theory with complex fields can exhibit a global 
U(1) phase invariance. As we shall show in section 7.1, such a symmetry 
indeed leads to the existence of a conserved quantum number, in terms of 
which we can distinguish the particle and antiparticle parts of a complex 
scalar field. 

In section 7.2 we generalize the complex scalar field to the complex spinor 
(Dirac) field, suitable for charged spin-$\frac{1}{2}$ particles. Again we find an analogous
conserved quantum number, associated with a global U(1) phase invariance of
the Lagrangian, which serves to distinguish particle from antiparticle. Cen- 
tral to the satisfactory physical interpretation of the Dirac field will be the 
requirement that it must be quantized with anticommutation relations — the 
famous ‘spin-statistics’ connection. 

The electromagnetic field must then be quantized, and section 7.3 describes
the considerable difficulties this poses. With all this in place, we can easily 
introduce (section 7.4) electromagnetic interactions via the ‘gauge principle’ 
of chapter 2. The resulting Lagrangians and Feynman rules will be applied to 
simple processes in the following chapter. In the final section of this chapter, 
we return to the discrete symmetries of chapter 4, and extend them from the 
single particle theory to quantum field theory. 




7.1 The complex scalar field: global U(1) phase 
invariance, particles and antiparticles 


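Consider a Lagrangian for two free fields $\hat\phi_1$ and $\hat\phi_2$ having the same mass $M$:

$$\hat{\mathcal{L}} = \tfrac{1}{2}\partial_\mu\hat\phi_1\partial^\mu\hat\phi_1 - \tfrac{1}{2}M^2\hat\phi_1^2 + \tfrac{1}{2}\partial_\mu\hat\phi_2\partial^\mu\hat\phi_2 - \tfrac{1}{2}M^2\hat\phi_2^2. \tag{7.1}$$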


We shall see how this is appropriate to a ‘particle-antiparticle’ situation. 

In general ‘particle’ and ‘antiparticle’ are distinguished by having opposite 
values of one or more conserved additive quantum numbers. Since these quan- 
tum numbers are conserved, the operators corresponding to them commute 
with the Hamiltonian and are constant in time (in the Heisenberg formulation 
— see equation (5.59)); such operators are called symmetry operators and will 
be increasingly important in later chapters. For the present we consider the 
simplest case in which ‘particle’ and ‘antiparticle’ are distinguished by having 
opposite eigenvalues of just one symmetry operator. This situation is already 
realized in the simple Lagrangian of (7.1). The symmetry involved is just this: 
$\hat{\mathcal{L}}$ of (7.1) is left unchanged (is invariant) if $\hat\phi_1$ and $\hat\phi_2$ are replaced by
$\hat\phi_1'$ and $\hat\phi_2'$, where (cf (2.64))

$$\begin{aligned}
\hat\phi_1' &= (\cos\alpha)\hat\phi_1 - (\sin\alpha)\hat\phi_2 \\
\hat\phi_2' &= (\sin\alpha)\hat\phi_1 + (\cos\alpha)\hat\phi_2
\end{aligned} \tag{7.2}$$

where $\alpha$ is a real parameter. This is like a rotation of coordinates about the
z-axis of ordinary space, but of course it mixes field degrees of freedom, not
spatial coordinates. The symmetry transformation of (7.2) is sometimes called an
‘O(2) transformation’, referring to the two-dimensional rotation group O(2).
We can easily check the invariance of $\hat{\mathcal{L}}$, i.e.

$$\hat{\mathcal{L}}(\hat\phi_1', \hat\phi_2') = \hat{\mathcal{L}}(\hat\phi_1, \hat\phi_2); \tag{7.3}$$

see problem 7.1.
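The invariance (7.3) can also be spot-checked numerically. The following is a minimal sketch, not a substitute for problem 7.1: it treats the fields and their gradients at a single space-time point as ordinary numbers (a classical stand-in for the operators), and the sample values and the helper names `lagrangian` and `rotate` are hypothetical, chosen only for illustration.

```python
import math

def lagrangian(phi1, phi2, dphi1, dphi2, M):
    """Classical analogue of (7.1) evaluated at one space-time point.

    dphi1 and dphi2 hold the gradients (d_t, d_x, d_y, d_z) of each field;
    the metric signature is (+, -, -, -).
    """
    def minkowski_square(d):
        return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2

    return (0.5 * minkowski_square(dphi1) - 0.5 * M ** 2 * phi1 ** 2
            + 0.5 * minkowski_square(dphi2) - 0.5 * M ** 2 * phi2 ** 2)

def rotate(a1, a2, alpha):
    """Apply the O(2) rotation (7.2) to a pair of numbers."""
    c, s = math.cos(alpha), math.sin(alpha)
    return c * a1 - s * a2, s * a1 + c * a2

# Arbitrary sample values (assumptions made for this check only).
M, alpha = 1.3, 0.7
phi1, phi2 = 0.4, -1.1
dphi1 = (0.3, -0.2, 0.5, 0.1)
dphi2 = (-0.6, 0.9, 0.2, -0.4)

# alpha is a global parameter, so the gradients rotate exactly as the fields do.
phi1_r, phi2_r = rotate(phi1, phi2, alpha)
rotated = [rotate(a, b, alpha) for a, b in zip(dphi1, dphi2)]
dphi1_r = tuple(r[0] for r in rotated)
dphi2_r = tuple(r[1] for r in rotated)

before = lagrangian(phi1, phi2, dphi1, dphi2, M)
after = lagrangian(phi1_r, phi2_r, dphi1_r, dphi2_r, M)
print(abs(before - after) < 1e-12)
```

The check relies on both fields having the same mass $M$; with unequal masses the mass terms would not combine into a rotation-invariant sum.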
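Now let us see what is the conservation law associated with this symmetry.
It is simpler (and sufficient) to consider an infinitesimal rotation characterized
by the infinitesimal parameter $\epsilon$, for which $\cos\epsilon \approx 1$ and $\sin\epsilon \approx \epsilon$ so that (7.2)
becomes

$$\begin{aligned}
\hat\phi_1' &= \hat\phi_1 - \epsilon\hat\phi_2 \\
\hat\phi_2' &= \hat\phi_2 + \epsilon\hat\phi_1
\end{aligned} \tag{7.4}$$

and we can define changes $\delta\hat\phi_i$ by

$$\begin{aligned}
\delta\hat\phi_1 &\equiv \hat\phi_1' - \hat\phi_1 = -\epsilon\hat\phi_2 \\
\delta\hat\phi_2 &\equiv \hat\phi_2' - \hat\phi_2 = +\epsilon\hat\phi_1.
\end{aligned} \tag{7.5}$$

Under this transformation $\hat{\mathcal{L}}$ is invariant, and so $\delta\hat{\mathcal{L}} = 0$. But $\hat{\mathcal{L}}$ is an explicit
function of $\hat\phi_1$, $\hat\phi_2$, $\partial_\mu\hat\phi_1$ and $\partial_\mu\hat\phi_2$. Thus we can write

$$0 = \delta\hat{\mathcal{L}} = \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_1)}\delta(\partial_\mu\hat\phi_1) + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_2)}\delta(\partial_\mu\hat\phi_2) + \frac{\partial\hat{\mathcal{L}}}{\partial\hat\phi_1}\delta\hat\phi_1 + \frac{\partial\hat{\mathcal{L}}}{\partial\hat\phi_2}\delta\hat\phi_2. \tag{7.6}$$

This is a bit like the manipulations leading up to the derivation of the
Euler-Lagrange equations in section 5.2.4, but now the changes $\delta\hat\phi_i$ ($i = 1, 2$) have
nothing to do with space-time trajectories; they mix up the two fields. However,
we can use the equations of motion for $\hat\phi_1$ and $\hat\phi_2$ to rewrite $\delta\hat{\mathcal{L}}$ as

$$0 = \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_1)}\delta(\partial_\mu\hat\phi_1) + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_2)}\delta(\partial_\mu\hat\phi_2) + \partial_\mu\!\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_1)}\right)\delta\hat\phi_1 + \partial_\mu\!\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_2)}\right)\delta\hat\phi_2. \tag{7.7}$$

Since $\delta(\partial_\mu\hat\phi_i) = \partial_\mu(\delta\hat\phi_i)$, the right-hand side of (7.7) is just a total divergence,
and (7.7) becomes

$$0 = \partial_\mu\left[\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_1)}\delta\hat\phi_1 + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_2)}\delta\hat\phi_2\right]. \tag{7.8}$$

These formal steps are actually perfectly general, and will apply whenever
a certain Lagrangian depending on two fields $\hat\phi_1$ and $\hat\phi_2$ is invariant under
$\hat\phi_i \to \hat\phi_i + \delta\hat\phi_i$. In the present case, with $\delta\hat\phi_i$ given by (7.5), we have

$$0 = \partial_\mu\left[-\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_1)}\epsilon\hat\phi_2 + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_2)}\epsilon\hat\phi_1\right] = \epsilon\,\partial_\mu\left[(\partial^\mu\hat\phi_2)\hat\phi_1 - (\partial^\mu\hat\phi_1)\hat\phi_2\right] \tag{7.9}$$

where the free-field Lagrangian (7.1) has been used in the second step. Since
$\epsilon$ is arbitrary, we have proved that the 4-vector operator

$$\hat{N}^\mu_\phi = \hat\phi_1\partial^\mu\hat\phi_2 - \hat\phi_2\partial^\mu\hat\phi_1 \tag{7.10}$$

is conserved:

$$\partial_\mu\hat{N}^\mu_\phi = 0. \tag{7.11}$$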


Such conserved 4-vector operators are called symmetry currents, often denoted 
generically by $\hat{J}^\mu$. There is a general theorem (due to Noether (1918) in the
classical field case) to the effect that if a Lagrangian is invariant under a 
continuous transformation, then there will be an associated symmetry current. 
We shall consider Noether’s theorem again in volume 2. 
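As a rough numerical illustration of the conservation law (7.11), the sketch below evaluates the divergence of the current (7.10) for two classical plane-wave solutions of the KG equation in one space dimension. Everything here is an assumption made for the example: the classical (c-number) fields, the amplitudes, wavenumbers and phase, and the finite-difference step `h`. It is a sanity check under those assumptions, not a proof.

```python
import math

M = 1.0                      # common mass of the two fields
k1, k2 = 0.8, -1.7           # hypothetical wavenumbers
w1 = math.sqrt(k1 ** 2 + M ** 2)
w2 = math.sqrt(k2 ** 2 + M ** 2)

def phi1(t, x):
    return 0.9 * math.cos(w1 * t - k1 * x)

def phi2(t, x):
    return 1.4 * math.sin(w2 * t - k2 * x + 0.3)

def partial(f, t, x, axis, h=1e-4):
    """Central finite difference in t (axis=0) or x (axis=1)."""
    if axis == 0:
        return (f(t + h, x) - f(t - h, x)) / (2 * h)
    return (f(t, x + h) - f(t, x - h)) / (2 * h)

def current(t, x):
    """Components (N^0, N^x) of (7.10); note that d^x = -d/dx."""
    n0 = phi1(t, x) * partial(phi2, t, x, 0) - phi2(t, x) * partial(phi1, t, x, 0)
    nx = -(phi1(t, x) * partial(phi2, t, x, 1) - phi2(t, x) * partial(phi1, t, x, 1))
    return n0, nx

def divergence(t, x):
    """d_mu N^mu = dN^0/dt + dN^x/dx."""
    dn0_dt = partial(lambda a, b: current(a, b)[0], t, x, 0)
    dnx_dx = partial(lambda a, b: current(a, b)[1], t, x, 1)
    return dn0_dt + dnx_dx

sample_points = [(0.0, 0.0), (0.4, -1.2), (2.5, 0.7)]
largest = max(abs(divergence(t, x)) for t, x in sample_points)
print(largest < 1e-5)
```

The cancellation depends on both fields obeying the KG equation with the same mass; giving `w2` a different mass in the sketch leaves a divergence that is no longer small.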

What does all this have to do with symmetry operators? Written out in 
full, (7.11) is 


$$\partial\hat{N}^0_\phi/\partial t + \boldsymbol{\nabla}\cdot\hat{\boldsymbol{N}}_\phi = 0. \tag{7.12}$$

Integrating this equation over all space, we obtain

$$\frac{d}{dt}\int_{V\to\infty}\hat{N}^0_\phi\,d^3\boldsymbol{x} + \int_{S\to\infty}\hat{\boldsymbol{N}}_\phi\cdot d\boldsymbol{S} = 0 \tag{7.13}$$

where we have used the divergence theorem in the second term. Normally the
fields may be assumed to die off sufficiently fast at infinity that the surface
integral vanishes (by using wave packets, for example), and we can therefore
deduce that the quantity $\hat{N}_\phi$ is constant in time, where

$$\hat{N}_\phi = \int\hat{N}^0_\phi\,d^3\boldsymbol{x}; \tag{7.14}$$

that is, the volume integral of the $\mu = 0$ component of a symmetry current is
a symmetry operator.

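In order to see how $\hat{N}_\phi$ serves to distinguish ‘particle’ from ‘antiparticle’
in the simple example we are considering, it turns out to be convenient to
regard $\hat\phi_1$ and $\hat\phi_2$ as components of a single complex field

$$\hat\phi = \frac{1}{\sqrt{2}}(\hat\phi_1 - i\hat\phi_2), \qquad \hat\phi^\dagger = \frac{1}{\sqrt{2}}(\hat\phi_1 + i\hat\phi_2). \tag{7.15}$$

The plane-wave expansions of the form (5.155) for $\hat\phi_1$ and $\hat\phi_2$ imply that $\hat\phi$ has
the expansion

$$\hat\phi = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{-ik\cdot x} + \hat{b}^\dagger(k)e^{ik\cdot x}\right] \tag{7.16}$$

where

$$\hat{a}(k) = \frac{1}{\sqrt{2}}(\hat{a}_1 - i\hat{a}_2), \qquad \hat{b}^\dagger(k) = \frac{1}{\sqrt{2}}(\hat{a}_1^\dagger - i\hat{a}_2^\dagger) \tag{7.17}$$

and $\omega = (M^2 + \boldsymbol{k}^2)^{1/2}$. The operators $\hat{a}$, $\hat{a}^\dagger$, $\hat{b}$, $\hat{b}^\dagger$ obey the commutation
relations

$$[\hat{a}(k), \hat{a}^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}'), \qquad [\hat{b}(k), \hat{b}^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \tag{7.18}$$

with all others vanishing; this follows from the commutation relations

$$[\hat{a}_i(k), \hat{a}_j^\dagger(k')] = \delta_{ij}(2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \quad \text{etc} \tag{7.19}$$

for the $\hat{a}_i$ operators. Note that two distinct mode operators, $\hat{a}$ and $\hat{b}$, are
appearing in the expansion (7.16) of the complex field.


In terms of this complex $\hat\phi$ the Lagrangian of (7.1) becomes

$$\hat{\mathcal{L}} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - M^2\hat\phi^\dagger\hat\phi \tag{7.20}$$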


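and the Hamiltonian is (dropping the zero-point energy, i.e. normally ordering)

$$\hat{H} = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{b}^\dagger(k)\hat{b}(k)\right]\omega. \tag{7.21}$$

The O(2) transformation (7.2) becomes a simple phase change

$$\hat\phi' = e^{-i\alpha}\hat\phi \tag{7.22}$$

which (see comment (iii) of section 2.6) is called a global U(1) phase
transformation; plainly the Lagrangian (7.20) is invariant under (7.22). The associated
symmetry current $\hat{N}^\mu_\phi$ becomes

$$\hat{N}^\mu_\phi = i(\hat\phi^\dagger\partial^\mu\hat\phi - \hat\phi\,\partial^\mu\hat\phi^\dagger) \tag{7.23}$$

and the symmetry operator $\hat{N}_\phi$ is (see problem 7.2)

$$\hat{N}_\phi = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\left[\hat{a}^\dagger(k)\hat{a}(k) - \hat{b}^\dagger(k)\hat{b}(k)\right]. \tag{7.24}$$

Note that $\hat{N}_\phi$ has been normally ordered in anticipation of our later vacuum
definition (7.30), so that $\hat{N}_\phi|0\rangle = 0$.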
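The step from the O(2) rotation (7.2) to the phase change (7.22) is a short piece of algebra that can be checked with ordinary numbers. The sketch below is only illustrative: it uses classical sample values in place of the field operators, and the helper name `to_complex` and the numbers chosen are assumptions.

```python
import cmath
import math

def to_complex(phi1, phi2):
    """Combine two real components as in (7.15): phi = (phi1 - i phi2)/sqrt(2)."""
    return (phi1 - 1j * phi2) / math.sqrt(2)

alpha = 0.9                  # rotation angle (hypothetical value)
phi1, phi2 = 0.37, -1.25     # sample field values at one point

# O(2) rotation (7.2) acting on the real components.
phi1_rot = math.cos(alpha) * phi1 - math.sin(alpha) * phi2
phi2_rot = math.sin(alpha) * phi1 + math.cos(alpha) * phi2

rotated_then_combined = to_complex(phi1_rot, phi2_rot)
combined_then_phased = cmath.exp(-1j * alpha) * to_complex(phi1, phi2)

# The two routes agree, which is the content of (7.22).
print(abs(rotated_then_combined - combined_then_phased) < 1e-12)

# |phi|^2 = (phi1^2 + phi2^2)/2 is unchanged, as expected for a pure phase.
print(abs(abs(rotated_then_combined) ** 2 - 0.5 * (phi1 ** 2 + phi2 ** 2)) < 1e-12)
```

With the opposite sign convention, $\hat\phi = (\hat\phi_1 + i\hat\phi_2)/\sqrt{2}$, the same rotation would appear as $e^{+i\alpha}$ instead.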

We now observe that the Hamiltonian (7.21) involves the sum of the number
operators for ‘a’ quanta and ‘b’ quanta, whereas $\hat{N}_\phi$ involves the difference
of these number operators. Put differently, $\hat{N}_\phi$ counts +1 for each particle of
type ‘a’ and −1 for each of type ‘b’. This strongly suggests the interpretation
that the b’s are the antiparticles of the a’s: $\hat{N}_\phi$ is the conserved symmetry
operator whose eigenvalues serve to distinguish them. For a general state, the
eigenvalue of $\hat{N}_\phi$ is the number of a’s minus the number of anti-a’s and it is
a constant of the motion, as is the total energy, which is the sum of the a
energies and anti-a energies.

We have here the simplest form of the particle-antiparticle distinction:
only one additive conserved quantity is involved. A more complicated example
would be the $(K^+, K^-)$ pair, which have opposite values of strangeness and of
electric charge. Of course, in our simple Lagrangian (7.20) the electromagnetic
interaction is absent, and so no electric charge can be defined (we shall remedy
this later); the complex field $\hat\phi$ would be suitable (in respect of strangeness)
for describing the $(K^0, \bar{K}^0)$ pair.

The symmetry operator $\hat{N}_\phi$ has a number of further important properties.
First of all, we have shown that $d\hat{N}_\phi/dt = 0$ from the general (Noether)
argument, but we ought also to check that

$$[\hat{N}_\phi, \hat{H}] = 0 \tag{7.25}$$

as is required for consistency, and expected for a symmetry operator. This is
indeed true (see problem 7.2(a)). We can also show

$$[\hat{N}_\phi, \hat\phi] = -\hat\phi \tag{7.26}$$

and, by expansion of the exponential (problem 7.2(b)), that

$$\hat{U}(\alpha)\hat\phi\,\hat{U}^{-1}(\alpha) = e^{-i\alpha}\hat\phi = \hat\phi' \tag{7.27}$$

with

$$\hat{U}(\alpha) = e^{i\alpha\hat{N}_\phi}. \tag{7.28}$$

This shows that the unitary operator $\hat{U}(\alpha)$ effects finite U(1) rotations.
Consider now a state $|N_\phi\rangle$ which is an eigenstate of $\hat{N}_\phi$ with eigenvalue
$N_\phi$. What is the eigenvalue of $\hat{N}_\phi$ for the state $\hat\phi|N_\phi\rangle$? It is easy to show,
using (7.26), that

$$\hat{N}_\phi\hat\phi|N_\phi\rangle = (N_\phi - 1)\hat\phi|N_\phi\rangle \tag{7.29}$$

so the application of $\hat\phi$ to a state lowers its $\hat{N}_\phi$ eigenvalue by 1. This is
consistent with our interpretation that the $\hat\phi$ field destroys particles ‘a’ via
the $\hat{a}$ piece in (7.16). (This ‘$\hat\phi$ destroys particles’ convention is the reason for
choosing $\hat\phi = (\hat\phi_1 - i\hat\phi_2)/\sqrt{2}$ in (7.15), which in turn led to the minus sign in
the relation (7.26) and to the earlier eigenvalue $N_\phi - 1$.) That $\hat\phi$ lowers the
$\hat{N}_\phi$ eigenvalue by 1 is also consistent with the interpretation that the same
field $\hat\phi$ creates an antiparticle via the $\hat{b}^\dagger$ piece in (7.16). In the same way, by
considering $\hat\phi^\dagger|N_\phi\rangle$, one easily verifies that $\hat\phi^\dagger$ increases $N_\phi$ by 1, by creating
a particle via $\hat{a}^\dagger$ or destroying an antiparticle via $\hat{b}$. The vacuum state (no
particles and no antiparticles present) is defined by

$$\hat{a}(k)|0\rangle = \hat{b}(k)|0\rangle = 0 \quad \text{for all } k. \tag{7.30}$$

As anticipated, therefore, the complex field $\hat\phi$ contains two distinct kinds
of mode operator, one having to do with particles (with positive $N_\phi$), the
other with antiparticles (negative $N_\phi$). Which we choose to call ‘particle’ and
which ‘antiparticle’ is of course purely a matter of convention: after all, the
negatively charged electron is always regarded as the ‘particle’, while in the
case of the pions we call the positively charged $\pi^+$ the particle.

FIGURE 7.1
(a) For $t_1 > t_2$, a $\hat\phi$ particle ($N_\phi = 1$) propagates from $x_2$ to $x_1$; (b) for $t_2 > t_1$
an anti-$\hat\phi$ particle ($N_\phi = -1$) propagates from $x_1$ to $x_2$.


Feynman rules for theories involving complex scalar fields may be derived
by a straightforward extension of the procedure explained in chapter 6. It
is, however, worth pausing over the propagator. The only non-vanishing vev
of the time-ordered product of two $\hat\phi$ fields is $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ (the vev’s
of $T(\hat\phi\hat\phi)$ and $T(\hat\phi^\dagger\hat\phi^\dagger)$ vanish with the vacuum defined as in (7.30)). In
section 6.3.2 we gave a pictorial interpretation of the propagator for a real scalar
field; let us now consider the analogous pictures for the complex field. For
$t_1 > t_2$ the time-ordered product is $\hat\phi(x_1)\hat\phi^\dagger(x_2)$; using the expansion (7.16)
and the vacuum conditions (7.30), the only surviving term in the vev is that
in which an ‘$\hat{a}^\dagger$’ creates a particle ($N_\phi = 1$) at $(\boldsymbol{x}_2, t_2)$ and an ‘$\hat{a}$’ destroys it
at $(\boldsymbol{x}_1, t_1)$; the ‘$\hat{b}$’ operators in $\hat\phi^\dagger(x_2)$ give zero when acting on $|0\rangle$, as do the
‘$\hat{b}^\dagger$’ operators in $\hat\phi(x_1)$ when acting on $\langle 0|$. Thus for $t_1 > t_2$ we have the
pictorial interpretation of figure 7.1(a). For $t_2 > t_1$, however, the time-ordered
product is $\hat\phi^\dagger(x_2)\hat\phi(x_1)$. Here the surviving vev comes from the ‘$\hat{b}^\dagger$’ in $\hat\phi(x_1)$
creating an antiparticle ($N_\phi = -1$) at $x_1$, which is then annihilated by the
‘$\hat{b}$’ in $\hat\phi^\dagger(x_2)$. This $t_2 > t_1$ process is shown in figure 7.1(b). The inclusion of
both processes shown in figure 7.1 makes sense physically, following
considerations similar to those put forward ‘intuitively’ in section 3.5.4: the process
of figure 7.1(a) creates (say) a positive unit of $N_\phi$ at $x_2$ and loses a positive
unit at $x_1$, while another way of effecting the same ‘$N_\phi$ transfer’ is to create
an antiparticle of unit negative $N_\phi$ at $x_1$, and propagate it to $x_2$ where it
is destroyed, as in figure 7.1(b). It is important to be absolutely clear that
the Feynman propagator $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ includes both the processes in
figures 7.1(a) and (b).

In practice, as we found in section 6.3.2, we want the momentum-space
version of the propagator, i.e. its Fourier transform. As we also noted there
(cf also appendix G), the propagator is a Green function for the KG operator
$(\Box + m^2)$ with mass parameter $m$; in momentum-space this is just the inverse,
$(-k^2 + m^2)^{-1}$. In the present case, since both $\hat\phi$ and $\hat\phi^\dagger$ obey the same
KG equation, with mass parameter $M$, we expect that the momentum-space
version of $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ is also

$$\frac{i}{k^2 - M^2 + i\epsilon}. \tag{7.31}$$

This can be verified by inserting the expansion (7.16) into the vev of the
T-product, and following the steps used in section 6.3.2 for the scalar case.

FIGURE 7.2
Equivalent Feynman graphs for single W-exchange in $\nu_e + e^- \to \nu_e + e^-$.

In this (momentum-space) version, it is the ‘$i\epsilon$’ which keeps track of the
‘particles going from 2 to 1 if $t_1 > t_2$’ and ‘antiparticles going from 1 to 2 if
$t_2 > t_1$’ (recall its appearance in the representation (6.93) of the all-important
$\theta$-function). As in the scalar case, momentum-space propagators in Feynman
diagrams carry no implied order of emission/absorption process; both the pro- 
cesses in figure 7.1 are always included in all propagators. Arrows showing 
‘momentum flow’ now also show the flow of all conserved quantum numbers. 
Thus the process shown in figure 7.2(a) can equally well be represented as in 
figure 7.2(b). 

There is one more bit of physics to be gleaned from $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$.
As in the real scalar field case, the vanishing of the commutator at space-like
separations

$$[\hat\phi(x_1), \hat\phi^\dagger(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{7.32}$$

guarantees the Lorentz invariance of the propagator for the complex scalar
field and of the S-matrix. But in this (complex) case there is a further twist
to the story. Evaluation of $[\hat\phi(x_1), \hat\phi^\dagger(x_2)]$ reveals (problem 7.3) that, in the
region $(x_1 - x_2)^2 < 0$, the commutator is the difference of two functions (not
field operators), one of which arises from the propagation of a particle from $x_2$
to $x_1$, the other of which comes from the propagation of an antiparticle from
$x_1$ to $x_2$ (just as in figure 7.1). Both processes must exist for this difference
to be zero, and furthermore for cancellations between them to occur in the 
space-like region the masses of the particle and antiparticle must be identi- 
cal. In quantum field theory, therefore, ‘causality’ (in the sense of condition 
(7.32) — cf (6.82)) requires that every particle has to have a corresponding 
antiparticle, with the same mass and opposite quantum numbers. As we saw 
in chapter 4, these requirements are guaranteed by the CPT theorem, which 
is a consequence of very general principles of quantum field theory. 


7.2 The Dirac field and the spin-statistics connection


I remember that when someone had tried to teach me about creation and
annihilation operators, that this operator creates an electron, I said ‘how 
do you create an electron? It disagrees with the conservation of charge,’ 
and in that way I blocked my mind from learning a very practical scheme 
of calculation. 


—From the lecture delivered by Richard Feynman in Stockholm, Sweden, 
on 11 December 1965, when he received the Nobel Prize in physics, which 
he shared with Sin-itiro Tomonaga and Julian Schwinger. (Feynman 1966). 


We now turn to the problem of setting up a quantum field which, in its 
wave aspects, satisfies the Dirac equation (cf comment (5) in section 5.2.5), 
and in its ‘particle’ aspects creates or annihilates fermions and antifermions. 
Following the ‘Heisenberg-Lagrange—Hamilton’ approach of section 5.2.5, we 
begin by writing down the Lagrangian which, via the corresponding Euler— 
Lagrange equation, produces the Dirac equation as the ‘field equation’. The 
answer (see problem 7.4) is 


$$\mathcal{L}_D = i\psi^\dagger\dot\psi + i\psi^\dagger\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi - m\psi^\dagger\beta\psi. \tag{7.33}$$

The relativistic invariance of this is more evident in $\gamma$-matrix notation
(problem 4.3):

$$\mathcal{L}_D = \bar\psi\,i\gamma^\mu\partial_\mu\psi - m\bar\psi\psi. \tag{7.34}$$

We can now attempt to ‘quantize’ the field $\psi$ by making a mode expansion
in terms of plane-wave solutions of the Dirac equation, in a fashion similar to
that for the complex scalar field in (7.16). We obtain (see problem 3.8 for the
definition of the spinors $u$ and $v$, and the attendant normalization choice)

$$\hat\psi = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\sum_{s=1,2}\left[\hat{c}_s(k)u(k,s)e^{-ik\cdot x} + \hat{d}_s^\dagger(k)v(k,s)e^{ik\cdot x}\right], \tag{7.35}$$

where $\omega = (m^2 + \boldsymbol{k}^2)^{1/2}$. We wish to interpret $\hat{c}_s^\dagger(k)$ as the creation operator
for a Dirac particle of spin $s$ and momentum $\boldsymbol{k}$. By analogy with (7.16), we
expect that $\hat{d}_s^\dagger(k)$ creates the corresponding antiparticle. Presumably we must
define the vacuum by (cf (7.30))

$$\hat{c}_s(k)|0\rangle = \hat{d}_s(k)|0\rangle = 0 \quad \text{for all } k \text{ and } s = 1, 2. \tag{7.36}$$

A two-fermion state is then

$$|k_1, s_1; k_2, s_2\rangle \propto \hat{c}_{s_1}^\dagger(k_1)\hat{c}_{s_2}^\dagger(k_2)|0\rangle. \tag{7.37}$$

But it is here that there must be a difference from the boson case. We require
a state containing two identical fermions to be antisymmetric under the
exchange of state labels $k_1 \leftrightarrow k_2$, $s_1 \leftrightarrow s_2$, and thus to be forbidden if the two
sets of quantum numbers are the same, in accordance with the Pauli exclusion
principle, responsible for so many well-established features of the structure of
matter.

The solution to this dilemma is simple but radical: for fermions, commuta- 
tion relations are replaced by anticommutation relations! The anticommutator 
of two operators A and B is written: 


{A,B} = AB+ BA. (7.38) 
If two different $\hat{c}$’s anticommute, then

$$\hat{c}_{s_1}^\dagger(k_1)\hat{c}_{s_2}^\dagger(k_2) + \hat{c}_{s_2}^\dagger(k_2)\hat{c}_{s_1}^\dagger(k_1) = 0 \tag{7.39}$$

so that we have the desired antisymmetry

$$|k_1, s_1; k_2, s_2\rangle = -|k_2, s_2; k_1, s_1\rangle. \tag{7.40}$$

In general we postulate

$$\begin{aligned}
\{\hat{c}_{s_1}(k_1), \hat{c}_{s_2}^\dagger(k_2)\} &= (2\pi)^3\delta^3(\boldsymbol{k}_1 - \boldsymbol{k}_2)\delta_{s_1 s_2} \\
\{\hat{c}_{s_1}(k_1), \hat{c}_{s_2}(k_2)\} &= \{\hat{c}_{s_1}^\dagger(k_1), \hat{c}_{s_2}^\dagger(k_2)\} = 0
\end{aligned} \tag{7.41}$$

and similarly for the $\hat{d}$’s and $\hat{d}^\dagger$’s. The factor in front of the $\delta$-function depends
on the convention for normalizing Dirac wavefunctions.

We must at once emphasize that in taking this ‘replace commutators by
anticommutators’ step we now depart decisively from the intuitive,
quasi-mechanical, picture of a quantum field given in chapter 5, namely as a system
of quantized harmonic oscillators. Of course, the field expansion (7.35) is
a linear superposition of ‘modes’ (plane-wave solutions), as for the complex
scalar field in (7.16) for example; but the ‘mode operators’ $\hat{c}_s$ and $\hat{d}_s^\dagger$ are
fermionic (obeying anticommutation relations) not bosonic (obeying
commutation relations). As mentioned at the end of section 5.1, it does not seem
possible to provide any mechanical model of a system (in three dimensions)
whose normal vibrations are fermionic. Correspondingly, there is no
concept of a ‘classical electron field’, analogous to the classical electromagnetic
field (which doubtless explains why we tend to think of fermions as basically
‘more particle-like’). However, we can certainly recover a quantum
mechanical wavefunction from (7.35) by considering, as in comment (5) of section 5.4,
the vacuum-to-one-particle matrix element $\langle 0|\hat\psi(\boldsymbol{x}, t)|k_1, s_1\rangle$.
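Although no mechanical model is available, relations of the form (7.41) are easy to realize with small matrices. The sketch below is an illustration under stated assumptions, not the construction used in the text: it takes two discrete modes (so the factor $(2\pi)^3\delta^3$ is replaced by a Kronecker delta), builds the operators by a Jordan-Wigner-style tensor product, and checks the anticommutators and the antisymmetry (7.40). All names (`c1`, `c2`, `kron`, and so on) are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product for nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def transpose(A):
    """For real matrices the Hermitian conjugate is just the transpose."""
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Tensor (Kronecker) product of two matrices."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # sign string that makes different modes anticommute
LOWER = [[0, 1], [0, 0]]     # single-mode annihilator: |1> -> |0>

c1 = kron(LOWER, I2)         # annihilator for mode 1
c2 = kron(Z, LOWER)          # annihilator for mode 2
c1_dag, c2_dag = transpose(c1), transpose(c2)

I4 = kron(I2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]

print(anticommutator(c1, c1_dag) == I4)          # {c1, c1†} = 1
print(anticommutator(c1, c2_dag) == ZERO4)       # different modes anticommute
print(anticommutator(c1_dag, c2_dag) == ZERO4)   # analogue of (7.39)

# Two-fermion states built in opposite orders differ by a sign, as in (7.40).
vacuum = [[1], [0], [0], [0]]
state_12 = matmul(c1_dag, matmul(c2_dag, vacuum))
state_21 = matmul(c2_dag, matmul(c1_dag, vacuum))
print(state_12 == [[-x for x in row] for row in state_21])

# Pauli principle: the same mode cannot be filled twice.
print(matmul(c1_dag, c1_dag) == ZERO4)
```

Dropping the `Z` factor in `c2` would give operators that commute between different modes, and the sign flip between `state_12` and `state_21` would be lost.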

In the bosonic case, we arrived at the commutation relations (5.130) for the 
mode operators by postulating the ‘fundamental commutator of quantum field 
theory’, equation (5.117), which was an extension to fields of the canonical 
commutation relations of quantum (particle) mechanics. For fermions, we 
have simply introduced the anticommutation relations (7.41) ‘by hand’, so 
as to satisfy the Pauli principle. We may ask: What then becomes of the 
analogous ‘fundamental commutator’ in the fermionic case? A plausible guess 
is that, as with the mode operators, the ‘fundamental commutator’ is to be 
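replaced by a ‘fundamental anticommutator’, between the fermionic field $\hat\psi$
and its ‘canonically conjugate momentum field’ $\hat\pi_D$, of the form:

$$\{\hat\psi(\boldsymbol{x}, t), \hat\pi_D(\boldsymbol{y}, t)\} = i\delta^3(\boldsymbol{x} - \boldsymbol{y}). \tag{7.42}$$

As far as $\hat\pi_D$ is concerned, we may suppose that its definition is formally
analogous to (5.122), which would yield

$$\hat\pi_D = \frac{\partial\hat{\mathcal{L}}_D}{\partial\dot{\hat\psi}} = i\hat\psi^\dagger. \tag{7.43}$$

We must also not forget that both $\hat\psi$ and $\hat\pi_D$ are four-component objects,
carrying spinor indices. Thus we are led to expect the result

$$\{\hat\psi_\alpha(\boldsymbol{x}, t), \hat\psi_\beta^\dagger(\boldsymbol{y}, t)\} = \delta^3(\boldsymbol{x} - \boldsymbol{y})\delta_{\alpha\beta}, \tag{7.44}$$

where $\alpha$ and $\beta$ are spinor indices. It is a good exercise to check, using (7.41),
that this is indeed the case (problem 7.5). We also find

$$\{\hat\psi(\boldsymbol{x}, t), \hat\psi(\boldsymbol{y}, t)\} = \{\hat\psi^\dagger(\boldsymbol{x}, t), \hat\psi^\dagger(\boldsymbol{y}, t)\} = 0. \tag{7.45}$$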


In this (anticommutator) sense, then, we have a ‘canonical’ formalism for 
fermions. 
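The Dirac Hamiltonian density is then (cf (5.123))

$$\hat{\mathcal{H}}_D = \hat\pi_D\dot{\hat\psi} - \hat{\mathcal{L}}_D = \hat\psi^\dagger\boldsymbol{\alpha}\cdot(-i\boldsymbol{\nabla})\hat\psi + m\hat\psi^\dagger\beta\hat\psi \tag{7.46}$$

using (7.43) and (7.33), and the Hamiltonian is

$$\hat{H}_D = \int\left[\hat\psi^\dagger\boldsymbol{\alpha}\cdot(-i\boldsymbol{\nabla})\hat\psi + m\hat\psi^\dagger\beta\hat\psi\right]d^3\boldsymbol{x}. \tag{7.47}$$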


One may well wonder why things have to be this way — ‘bosons commute, 
fermions anticommute’. To gain further insight, we turn again to a consider- 
ation of symmetries and the question of particle and antiparticle — this time 
for the Dirac field, rather than the Dirac wavefunction discussed in chapter 4. 

The Dirac field $\hat\psi$ is a complex field, as is reflected in the two distinct mode
operators in the expansion (7.35); as in the complex scalar field case, there
is only one mass parameter and we expect the quanta to be interpretable as
particle and antiparticle. The symmetry operator which distinguishes them is
found by analogy with the complex scalar field case. We note that $\hat{\mathcal{L}}_D$ (the
quantized version of (7.34)) is invariant under the global U(1) transformation

$$\hat\psi \to \hat\psi' = e^{-i\alpha}\hat\psi \tag{7.48}$$

which is

$$\hat\psi \to \hat\psi' = \hat\psi - i\epsilon\hat\psi \tag{7.49}$$

in infinitesimal form. The corresponding (Noether) symmetry current can be
calculated as

$$\hat{N}^\mu_\psi = \bar{\hat\psi}\gamma^\mu\hat\psi \tag{7.50}$$

and the associated symmetry operator is

$$\hat{N}_\psi = \int\hat\psi^\dagger\hat\psi\,d^3\boldsymbol{x}. \tag{7.51}$$

$\hat{N}_\psi$ is clearly a number operator for the fermion case. As for the complex
scalar field, invariance under a global U(1) phase transformation is associated 
with a number conservation law. 

Inserting the plane-wave expansion (7.35), we obtain, after some effort 
(problem 7.6), 


$$\hat{N}_\psi = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\sum_{s=1,2}\left[\hat{c}_s^\dagger(k)\hat{c}_s(k) + \hat{d}_s(k)\hat{d}_s^\dagger(k)\right]. \tag{7.52}$$

Similarly the Dirac Hamiltonian may be shown to have the form (problem 7.6)

$$\hat{H}_D = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\sum_{s=1,2}\left[\hat{c}_s^\dagger(k)\hat{c}_s(k) - \hat{d}_s(k)\hat{d}_s^\dagger(k)\right]\omega. \tag{7.53}$$

It is important to state that in obtaining (7.52) and (7.53), we have not
assumed either commutation or anticommutation relations for the mode
operators $\hat{c}$, $\hat{c}^\dagger$, $\hat{d}$ and $\hat{d}^\dagger$, only properties of the Dirac spinors; in particular, neither
(7.52) nor (7.53) has been normally ordered. Suppose now that we assume
commutation relations, so as to rewrite the last terms in (7.52) and (7.53) in
normally ordered form as $\hat{d}_s^\dagger(k)\hat{d}_s(k)$. We see that $\hat{H}_D$ will then contain the
difference of two number operators for ‘c’ and ‘d’ particles, and is therefore
not positive-definite as we require for a sensible theory. Moreover, we suspect
that, as in the $\hat\phi$ case, the ‘d’s’ ought to be the antiparticles of the ‘c’s’,
carrying opposite $\hat{N}_\psi$ value: but $\hat{N}_\psi$ is then (with the previous assumption about
commutation relations) just proportional to the sum of ‘c’ and ‘d’ number
operators, counting +1 for each type, which does not fit this interpretation.
However, if anticommutation relations are assumed, both these problems
disappear: dropping the usual infinite terms, we obtain the normally ordered
forms

$$\hat{N}_\psi = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\sum_{s=1,2}\left[\hat{c}_s^\dagger(k)\hat{c}_s(k) - \hat{d}_s^\dagger(k)\hat{d}_s(k)\right] \tag{7.54}$$

$$\hat{H}_D = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\sum_{s=1,2}\left[\hat{c}_s^\dagger(k)\hat{c}_s(k) + \hat{d}_s^\dagger(k)\hat{d}_s(k)\right]\omega \tag{7.55}$$

which are satisfactory, and allow us to interpret the ‘d’ quanta as the
antiparticles of the ‘c’ quanta. Similar difficulties would have occurred in the complex
scalar field case if we had assumed anticommutation relations for the boson
operators, and the ‘causality’ discussion at the end of the preceding section
would not have worked either (instead of a difference of terms we would have
had a sum). It is in this way that quantum field theory enforces the connection
between spin and statistics. 
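The difference between the two choices can be seen in a toy version of (7.53) with a single ‘c’ mode and a single ‘d’ mode. The sketch below is an illustration under assumptions: discrete modes with unit normalization (so the ‘usual infinite term’ becomes a finite constant), and occupation numbers used directly in place of operators. The function name `energy` is hypothetical.

```python
def energy(n_c, n_d, omega, statistics):
    """Eigenvalue of omega * (c†c - d d†) on a state with occupations n_c, n_d."""
    if statistics == "fermi":
        d_ddag = 1 - n_d      # {d, d†} = 1  implies  d d† = 1 - d†d
    elif statistics == "bose":
        d_ddag = 1 + n_d      # [d, d†] = 1  implies  d d† = 1 + d†d
    else:
        raise ValueError("statistics must be 'fermi' or 'bose'")
    return omega * (n_c - d_ddag)

omega = 1.0

# Anticommutation: occupations are 0 or 1, and the spectrum is bounded below.
fermi_spectrum = sorted(energy(n_c, n_d, omega, "fermi")
                        for n_c in (0, 1) for n_d in (0, 1))
print(fermi_spectrum)

# Commutation: adding 'd' quanta lowers the energy without limit.
bose_ladder = [energy(0, n_d, omega, "bose") for n_d in range(5)]
print(bose_ladder)
```

With anticommutation the energies are $\omega(n_c + n_d) - \omega$, i.e. the form (7.55) up to a constant, whereas with commutation each extra ‘d’ quantum lowers the energy by $\omega$.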

Our discussion here is only a part of a more general approach leading to 
the same conclusion, first given by Pauli (1940); see also Streater et al. (1964). 

As in the complex scalar case, the other crucial ingredient we need is the 
Dirac propagator $\langle 0|T(\hat\psi(x_1)\bar{\hat\psi}(x_2))|0\rangle$. We shall see in section 7.4 why it is $\bar{\hat\psi}$
here rather than $\hat\psi^\dagger$; the reason is essentially to do with Lorentz covariance
(see section 4.1.2). Because the $\hat\psi$ fields are anticommuting, the T-symbol
now has to be understood as

$$\begin{aligned}
T(\hat\psi(x_1)\bar{\hat\psi}(x_2)) &= \hat\psi(x_1)\bar{\hat\psi}(x_2) && \text{for } t_1 > t_2 \qquad (7.56) \\
&= -\bar{\hat\psi}(x_2)\hat\psi(x_1) && \text{for } t_2 > t_1. \qquad (7.57)
\end{aligned}$$

Once again, this propagator is proportional to a Green function, this time
for the Dirac equation, of course. Using $\gamma$-matrix notation (problem 4.3) the
Dirac equation is (cf (7.34))

$$(i\gamma^\mu\partial_\mu - m)\hat\psi = 0. \tag{7.58}$$

The momentum-space version of the propagator is proportional to the inverse
of the operator in (7.58), when written in k-space, namely to $(\not{k} - m)^{-1}$ where

$$\not{k} = \gamma^\mu k_\mu \tag{7.59}$$

is an important shorthand notation (pronounced ‘k-slash’). In fact, the
Feynman propagator for Dirac fields is

$$\frac{i}{\not{k} - m + i\epsilon}. \tag{7.60}$$

As in (7.31), the $i\epsilon$ takes care of the particle/antiparticle, emission/absorption
business. Formula (7.60) is the fermion analogue of ‘rule (ii)’ in (6.103).

The reader should note carefully one very important difference between
(7.60) and (7.31), which is that (7.60) is a 4×4 matrix. What we are really
saying (cf (6.98)) is that the Fourier transform of $\langle 0|T(\hat\psi_\alpha(x_1)\bar{\hat\psi}_\beta(x_2))|0\rangle$,
where $\alpha$ and $\beta$ run over the four components of the Dirac field, is equal to the
$(\alpha, \beta)$ matrix element of the matrix $i(\not{k} - m + i\epsilon)^{-1}$:

$$\int d^4(x_1 - x_2)\,e^{ik\cdot(x_1 - x_2)}\langle 0|T(\hat\psi_\alpha(x_1)\bar{\hat\psi}_\beta(x_2))|0\rangle = i(\not{k} - m + i\epsilon)^{-1}_{\alpha\beta}. \tag{7.61}$$

The form (7.61) can be made to look more like (7.31) by making use of the
result (problem 7.7)

$$(\not{k} - m)(\not{k} + m) = (k^2 - m^2) \tag{7.62}$$

(where the 4×4 unit matrix is understood on the right-hand side) so as to
write (7.61) as

$$\frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}. \tag{7.63}$$

As in the scalar case, (7.61) can be directly verified by inserting the field
expansion (7.35) into the left-hand side, and following steps analogous to those
in equations (6.92)-(6.98). In following this through one will meet the
expressions $\sum_s u(k, s)\bar{u}(k, s)$ and $\sum_s v(k, s)\bar{v}(k, s)$, which are also 4×4 matrices.
Problem 7.8 shows that these quantities are given by

$$\sum_s u_\alpha(k, s)\bar{u}_\beta(k, s) = (\not{k} + m)_{\alpha\beta}, \qquad \sum_s v_\alpha(k, s)\bar{v}_\beta(k, s) = (\not{k} - m)_{\alpha\beta}. \tag{7.64}$$

With these results, and remembering the minus sign in (7.57), one can check
(7.63) (problem 7.9).
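The matrix identity (7.62) can be confirmed numerically for a sample momentum. The sketch below is a check rather than the proof asked for in problem 7.7. It assumes the standard Dirac representation of the $\gamma$-matrices and the metric (+, −, −, −); the sample 4-momentum, the mass and all function names are hypothetical.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Dirac representation: gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def gamma_spatial(sigma):
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g

GAMMAS = [GAMMA0] + [gamma_spatial(s) for s in PAULI]

def slash(k):
    """k-slash = gamma^mu k_mu = gamma^0 k^0 - gamma.k for contravariant k."""
    signs = (1, -1, -1, -1)
    return [[sum(signs[mu] * k[mu] * GAMMAS[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def shifted(A, m):
    """Return A + m * (4x4 unit matrix)."""
    return [[A[i][j] + m * IDENTITY[i][j] for j in range(4)] for i in range(4)]

k = (2.0, 0.3, -0.5, 1.1)    # sample off-shell 4-momentum (k^0, k^1, k^2, k^3)
m = 0.7
k_squared = k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2

product = matmul(shifted(slash(k), -m), shifted(slash(k), +m))
max_deviation = max(abs(product[i][j] - (k_squared - m ** 2) * IDENTITY[i][j])
                    for i in range(4) for j in range(4))
print(max_deviation < 1e-12)
```

Because the identity only uses $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$, any other representation of the $\gamma$-matrices should pass the same check.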

One might now worry that the adoption of anticommutation relations for 
Dirac fields might spoil ‘causality’, in the sense of the discussion after (7.32). 
One finds, indeed, that the fields $\hat\psi$ and $\bar{\hat\psi}$ anticommute at space-like
tion, but this is enough to preserve causality for physical observables, which 
will involve an even number of fermionic fields. 

We now turn to the problem of quantizing the Maxwell (electromagnetic) 
field. 


7.3 The Maxwell field $A^\mu(x)$
7.3.1 The classical field case 


Following the now familiar procedure, our first task is to find the classical field 
Lagrangian which, via the corresponding Euler-Lagrange equations, yields
the Maxwell equation for the electromagnetic potential $A^\nu$, namely (cf (2.22))

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \tag{7.65}$$

The answer is (see problem 7.10)

$$\mathcal{L}_{\rm em} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\nu_{\rm em}A_\nu \tag{7.66}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. So the pure A-field part is the Maxwell Lagrangian

$$\mathcal{L}_A = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}. \tag{7.67}$$

Before proceeding to try to quantize (7.67), we need to understand some
important aspects of the free classical field $A^\nu(x)$.
When $j^\nu_{\rm em}$ is set equal to zero, $A^\nu$ satisfies the equation

$$\partial_\mu F^{\mu\nu} = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = 0. \tag{7.68}$$

As we have seen in section 2.3, these equations are left unchanged if we perform
the gauge transformation

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{7.69}$$


We can use this freedom to choose the A” with which we work to satisfy the 


condition 
7.70) 


This is called the Lorentz condition. The process of choosing a particular
condition on $A^\mu$ so as to define it (ultimately) uniquely is called ‘choosing
a gauge’; actually the condition (7.70) does not yet define $A^\mu$ uniquely, as
we shall see shortly. The Lorentz condition is a very convenient one, since it
decouples the different components of $A^\mu$ in Maxwell’s equations (7.68), and
does so in a covariant way, leaving the very simple equation

$$\Box A^\mu = 0. \tag{7.71}$$

This has plane-wave solutions of the form

$$A^\mu = N\epsilon^\mu e^{-ik\cdot x} \tag{7.72}$$


with $k^2 = 0$ (i.e. $k_0^2 = \boldsymbol{k}^2$), where $N$ is a normalization factor and $\epsilon^\mu$ is a
polarization vector for the wave. The gauge condition (7.70) now reduces to
a condition on $\epsilon^\mu$:

$$k\cdot\epsilon = 0. \tag{7.73}$$


However, we have not yet exhausted all the gauge freedom. We are still free
to make another shift in the potential

$$A^\mu \to A^\mu - \partial^\mu\tilde\chi \tag{7.74}$$

provided $\tilde\chi$ satisfies the massless KG equation

$$\Box\tilde\chi = 0. \tag{7.75}$$


This condition on $\tilde\chi$ ensures that, even after the further shift, the resulting
potential still satisfies $\partial_\mu A^\mu = 0$. For our plane-wave solutions, this residual
gauge freedom corresponds to changing $\epsilon^\mu$ by a multiple of $k^\mu$:

$$\epsilon^\mu \to \epsilon^\mu + \beta k^\mu \equiv \epsilon'^\mu \tag{7.76}$$

which still satisfies $\epsilon'\cdot k = 0$ since $k^2 = 0$ for these free-field solutions. The
condition $k^2 = 0$ is, of course, the statement that a free photon is massless.
This freedom has important consequences. Consider a solution with 


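$$k^\mu = (k^0, \boldsymbol{k}), \qquad (k^0)^2 = \boldsymbol{k}^2 \tag{7.77}$$

and polarization vector

$$\epsilon^\mu = (\epsilon^0, \boldsymbol{\epsilon}) \tag{7.78}$$

satisfying the Lorentz condition

$$k\cdot\epsilon = 0. \tag{7.79}$$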


Gauge invariance now implies that we can add multiples of $k^\mu$ to $\epsilon^\mu$ and still
have a satisfactory polarization vector.

It is therefore clear that we can arrange for the time component of $\epsilon^\mu$ to
vanish so that the Lorentz condition reduces to the 3-vector condition

$$\boldsymbol{k}\cdot\boldsymbol{\epsilon} = 0. \tag{7.80}$$

This means that there are only two independent polarization vectors, both
transverse to $\boldsymbol{k}$, i.e. to the propagation direction. For a wave travelling in the
z-direction ($k^\mu = (k^0, 0, 0, k^0)$) these may be chosen to be

$$\boldsymbol{\epsilon}_{(1)} = (1, 0, 0) \tag{7.81}$$
$$\boldsymbol{\epsilon}_{(2)} = (0, 1, 0). \tag{7.82}$$


Such a choice corresponds to linear polarization of the associated $\boldsymbol{E}$ and $\boldsymbol{B}$
fields, which can be easily calculated from (2.10) and (2.11), given

$$A^\mu_{(i)} = N(0, \boldsymbol{\epsilon}_{(i)})e^{-ik\cdot x}, \qquad i = 1, 2. \tag{7.83}$$
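
The residual gauge shift (7.76) can be tried out numerically. The following is a minimal sketch, not part of the original text: the momentum and the starting polarization vector are hypothetical numbers chosen for illustration, and the helper names `minkowski_dot` and `remove_time_component` are our own. It picks $\beta = -\epsilon^0/k^0$ and shows that the shifted vector is purely transverse while still obeying $k\cdot\epsilon = 0$.

```python
# Metric signature (+, -, -, -), matching the text's conventions.
def minkowski_dot(a, b):
    """Lorentz-invariant product a.b of two 4-vectors given as lists."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def remove_time_component(eps, k):
    """Apply eps -> eps + beta*k, as in (7.76), with beta fixed so eps^0 -> 0."""
    beta = -eps[0] / k[0]
    return [e + beta * kc for e, kc in zip(eps, k)]


# Hypothetical numbers: a photon moving along z with k^0 = |k| = 2.
k = [2.0, 0.0, 0.0, 2.0]
# A polarization vector that obeys k.eps = 0 but has a time component.
eps = [0.5, 1.0, 0.0, 0.5]

eps_shifted = remove_time_component(eps, k)
print(minkowski_dot(k, eps))          # Lorentz condition before the shift
print(eps_shifted)                    # time and z components are removed
print(minkowski_dot(k, eps_shifted))  # Lorentz condition after the shift
```

For these inputs the shifted vector coincides with the first transverse vector of (7.81), embedded as a 4-vector with vanishing time component.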
A commonly used alternative choice is

$$\boldsymbol{\epsilon}(\lambda = +1) = -\frac{1}{\sqrt{2}}(1, i, 0) \tag{7.84}$$
$$\boldsymbol{\epsilon}(\lambda = -1) = \frac{1}{\sqrt{2}}(1, -i, 0) \tag{7.85}$$

(linear combinations of (7.81) and (7.82)), which correspond to circularly
polarized radiation. The phase convention in (7.84) and (7.85) is the standard
one in quantum mechanics for states of definite spin projection (‘helicity’)
$\lambda = \pm 1$ along the direction of motion (the z-axis here). We may easily check
that

$$\boldsymbol{\epsilon}^*(\lambda)\cdot\boldsymbol{\epsilon}(\lambda') = \delta_{\lambda\lambda'} \tag{7.86}$$

or, in terms of the corresponding 4-vectors $\epsilon^\mu = (0, \boldsymbol{\epsilon})$,

$$\epsilon^*(\lambda)\cdot\epsilon(\lambda') = -\delta_{\lambda\lambda'}. \tag{7.87}$$
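
The orthonormality relations (7.86) and (7.87) can be checked directly. This is a small illustrative sketch, assuming the helicity vectors exactly as written in (7.84) and (7.85); the names `EPS`, `hermitian_dot` and `minkowski_overlap` are ours.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Circular polarization 3-vectors from (7.84) and (7.85).
EPS = {
    +1: [complex(-S, 0.0), complex(0.0, -S), 0j],  # -(1, i, 0)/sqrt(2)
    -1: [complex(S, 0.0), complex(0.0, -S), 0j],   #  (1, -i, 0)/sqrt(2)
}


def hermitian_dot(a, b):
    """3-vector product a* . b, the left-hand side of (7.86)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def minkowski_overlap(a, b):
    """4-vector product for eps^mu = (0, eps), metric (+, -, -, -), as in (7.87)."""
    return -hermitian_dot(a, b)


for lam in (+1, -1):
    for lam2 in (+1, -1):
        print(lam, lam2,
              round(hermitian_dot(EPS[lam], EPS[lam2]).real, 12),
              round(minkowski_overlap(EPS[lam], EPS[lam2]).real, 12))
```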


We have therefore arrived at the result, familiar in classical electromagnetic 
theory, that the free electromagnetic fields are purely transverse. Though they 
are described in this formalism by a vector potential with apparently four 
independent components (V, A), the condition (7.70) reduces this number by 
one, and the further gauge freedom exploited in (7.74)—(7.76) reduces it by 
one more. 

A crucial point to note is that the reduction to only two independent field 
components (polarization states) can be traced back to the fact that the free 
photon is massless: see the remark after (7.76). By contrast, for massive spin-1
bosons, such as the $W^\pm$ and $Z^0$, all three expected polarization states are
indeed present. However, weak interactions are described by a gauge theory,
and the $W^\pm$ and $Z^0$ particles are gauge-field quanta, analogous to the photon.
How gauge invariance can be reconciled with the existence of massive gauge 
quanta with three polarization states will be explained in volume 2. 

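We may therefore write the plane-wave mode expansion for the classical
$A^\mu(x)$ field in the form

$$A^\mu(x) = \sum_\lambda \int \frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}} \left[\epsilon^\mu(k,\lambda)\alpha(k,\lambda)e^{-ik\cdot x} + \epsilon^{\mu*}(k,\lambda)\alpha^*(k,\lambda)e^{ik\cdot x}\right] \tag{7.88}$$

where the sum is over the two possible polarization states $\lambda$, for given $k$, as
described by the suitable polarization vector $\epsilon^\mu(k,\lambda)$, and $\omega = |\boldsymbol{k}|$.

It would seem that all we have to do now, in order to ‘quantize’ (7.88), is
to promote $\alpha$ and $\alpha^*$ to operators $\hat\alpha$ and $\hat\alpha^\dagger$, as usual. However, things are
actually not nearly so simple.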


7.3.2 Quantizing $A^\mu(x)$


Readers familiar with Lagrangian mechanics may already suspect that
quantizing $A^\mu$ is not going to be straightforward. The problem is that, clearly,
$A^\mu(x)$ has four (Lorentz) components but, equally clearly in view of the
previous section, they are not all independent field components or field
degrees of freedom. In fact, there are only two independent degrees of freedom,
both transverse. Thus there are constraints on the four fields, for instance the
gauge condition (7.70). Constrained systems are often awkward to handle in
classical mechanics (see for example Goldstein 1980) or classical field theory;
and they present major problems when it comes to canonical quantization.
It is actually at just this point that the ‘path-integral’ approach to quantiza- 
tion, alluded to briefly at the end of section 5.2.2, comes into its own. This 
is basically because it does not involve non-commuting (or anticommuting) 
operators and it is therefore to that extent closer to the classical case. This 
means that the relatively straightforward procedures available for constrained 
classical mechanics systems can — when suitably generalized! — be efficiently 
brought to bear on the quantum problem. For an introduction to these ideas, 
we refer to Swanson (1992). 

However, we do not wish at this stage to take what would be a very long
detour, in setting up the path-integral quantization of QED. We shall continue
along the ‘canonical’ route. To see the kind of problems we encounter, let us
try and repeat for the $A^\mu$ field the ‘canonical’ procedure we introduced in
section 5.2.5. This was based, crucially, on obtaining from the Lagrangian the
momentum $\pi$ conjugate to $\phi$, and then imposing the commutation relation
(5.117) on the corresponding operators $\hat\pi$ and $\hat\phi$. But inspection of our Maxwell
Lagrangian (7.67) quickly reveals that

$$\frac{\partial\mathcal{L}_A}{\partial\dot{A}^0} = 0 \tag{7.89}$$

and hence there is no canonical momentum $\pi^0$ conjugate to $A^0$. We appear
to be stymied before we can even start.

There is another problem as well. Following the procedure explained in
chapter 6, we expect that the Feynman propagator for the $A^\mu$ field, namely
$\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$, will surely appear, describing the propagation of a
photon between $x_1$ and $x_2$. In the case of real scalar fields, problem 6.3
showed that the analogous quantity was actually a Green function for the
KG differential operator, $(\Box + m^2)$. It turned out, in that case, that what
we really wanted was the Fourier transform of the Green function, which was
essentially (apart from the tricky ‘$i\epsilon$ prescription’ and a trivial $-i$ factor) the
inverse of the momentum-space operator corresponding to $(\Box + m^2)$, namely
$(-k^2 + m^2)^{-1}$ (see equation (6.98) and appendix G, and also (7.58)-(7.60) for
the Dirac case). Suppose, then, that we try to follow this route to obtaining
the propagator for the $A^\mu$ field. For this it is sufficient to consider the classical
equations (7.68) with $j_{\rm em} = 0$, written in $k$ space (problem 7.11(a)):

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)\tilde{A}_\mu(k) \equiv M^{\nu\mu}\tilde{A}_\mu(k) = 0 \tag{7.90}$$


where $\tilde{A}_\mu(k)$ is the Fourier transform of $A_\mu(x)$. We therefore require the
inverse

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)^{-1} \equiv (M^{-1})^{\nu\mu}. \tag{7.91}$$

Unfortunately it is easy to show that this inverse does not exist. From
Lorentz covariance, it has to transform as a second-rank tensor, and the only
ones available are $g^{\nu\mu}$ and $k^\nu k^\mu$. So the general form of $(M^{-1})^{\nu\mu}$ must be

$$(M^{-1})^{\nu\mu} = A(k^2)g^{\nu\mu} + B(k^2)k^\nu k^\mu. \tag{7.92}$$

Now the inverse is defined by

$$(M^{-1})^{\nu\mu}M_{\mu\sigma} = g^\nu_{\ \sigma}. \tag{7.93}$$

Putting (7.92) and (7.90) into (7.93) yields (problem 7.11(b))

$$-k^2 A(k^2)g^\nu_{\ \sigma} + A(k^2)k^\nu k_\sigma = g^\nu_{\ \sigma} \tag{7.94}$$

which cannot be satisfied. So we are thwarted again.
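
One way to see why the inverse fails to exist is that $M^{\nu\mu}k_\mu = 0$ for any $k$, so $M$ always has a zero eigenvalue. The sketch below is our own numerical illustration (the momentum components are arbitrary, and the helper names `det` and `maxwell_matrix` are hypothetical); it builds $M^{\nu\mu}$ of (7.90) for an off-shell $k$ and confirms that both $M^{\nu\mu}k_\mu$ and $\det M$ vanish up to rounding.

```python
# Metric g^{mu nu} = diag(1, -1, -1, -1).
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def maxwell_matrix(k_up):
    """M^{nu mu} = -k^2 g^{nu mu} + k^nu k^mu for contravariant k^mu, as in (7.90)."""
    k2 = sum(G[a][a] * k_up[a] ** 2 for a in range(4))
    return [[-k2 * G[n][m] + k_up[n] * k_up[m] for m in range(4)]
            for n in range(4)]


k_up = [3.0, 1.0, -2.0, 0.5]                     # arbitrary off-shell momentum
k_down = [G[a][a] * k_up[a] for a in range(4)]   # k_mu = g_{mu nu} k^nu
M = maxwell_matrix(k_up)

# M^{nu mu} k_mu = -k^2 k^nu + k^nu k^2 = 0, so k_mu is a null vector of M.
null_check = [sum(M[n][m] * k_down[m] for m in range(4)) for n in range(4)]
print(null_check)
print(det(M))
```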

Nothing daunted, the attentive reader may have an answer ready for the
propagator problem. Suppose that, instead of (7.68), we start from the much
simpler equation

$$\Box A^\mu = 0 \tag{7.95}$$

which results from imposing the Lorentz condition (7.70). Then, in
momentum-space, (7.95) becomes

$$-k^2\tilde{A}^\mu = 0. \tag{7.96}$$

The ‘$-k^2$’ on the left-hand side certainly has an inverse, implying that the
Feynman propagator for the photon is (proportional to) $g_{\mu\nu}/k^2$. This form
is indeed plausible, as it is very much what we would expect by taking the
massless limit of the spin-0 propagator and tacking on $g_{\mu\nu}$ to account for the
Lorentz indices in $\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle$ (but then why no term in $k_\mu k_\nu$?
See the final two paragraphs of this section!).

Perhaps this approach helps with the ‘no canonical momentum $\pi^0$’ problem
too. Let us ask: What Lagrangian leads to the field equation (7.95)? The
answer is (problem 7.12)

$$\mathcal{L}_L = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \tfrac{1}{2}(\partial_\mu A^\mu)^2. \tag{7.97}$$

This form does seem to offer better prospects for quantization, since at least
all our $\pi^\mu$’s are non-zero; in particular

$$\pi^0 = \frac{\partial\mathcal{L}_L}{\partial\dot{A}^0} = -\partial_\mu A^\mu. \tag{7.98}$$

The other $\pi$’s are unchanged by the addition of the extra term in (7.97) and
are given by

$$\pi^i = -\dot{A}^i + \partial^i A^0. \tag{7.99}$$

Interestingly, these are precisely the electric fields $E^i$ (see (2.10)). Let us see,
then, if all our problems are solved with $\mathcal{L}_L$.
Now that we have at least got four non-zero $\pi^\mu$’s, we can write down a
plausible set of commutation relations between the corresponding operator
quantities $\hat\pi^\mu$ and $\hat{A}^\nu$:

$$[\hat{A}_\mu(\boldsymbol{x},t), \hat\pi_\nu(\boldsymbol{y},t)] = ig_{\mu\nu}\delta^3(\boldsymbol{x}-\boldsymbol{y}). \tag{7.100}$$

Again, the $g_{\mu\nu}$ is there to give the same Lorentz transformation character
on both sides of the equation. But we must now remember that, in the
classical case, our development rested on imposing the condition $\partial_\mu A^\mu = 0$
(7.70). Can we, in the quantum version we are trying to construct, simply
impose $\partial_\mu\hat{A}^\mu = 0$? We certainly cannot do so in $\hat{\mathcal{L}}_L$, or we are back to $\hat{\mathcal{L}}_A$
again (besides, constraints cannot be ‘substituted back’ into Lagrangians, in
general). Furthermore, if we set $\mu = \nu = 0$ in (7.100), then the right-hand
side is non-zero while the left-hand side is zero if $\partial_\mu\hat{A}^\mu = 0 = \hat\pi^0$. So it is
inconsistent simply to set $\partial_\mu\hat{A}^\mu = 0$.

We will return to the treatment of ‘$\partial_\mu\hat{A}^\mu = 0$’ eventually. First, let us press
on with (7.97) and see if we can get as far as a (quantized) mode expansion,
of the form (7.88), for $\hat{A}^\mu(x)$.

To set this up, we need to massage the commutator (7.100) into a form
as close as possible to the canonical ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ form. Assuming the other
commutation relations (cf (5.118))

$$[\hat{A}_\mu(\boldsymbol{x},t), \hat{A}_\nu(\boldsymbol{y},t)] = [\hat\pi_\mu(\boldsymbol{x},t), \hat\pi_\nu(\boldsymbol{y},t)] = 0 \tag{7.101}$$

we see that the spatial derivatives of the $\hat{A}$’s commute with the $\hat{A}$’s, and with
each other, at equal times. This implies that we can rewrite the (quantum)
$\hat\pi$’s as

$$\hat\pi_\mu = -\dot{\hat{A}}_\mu + \text{pieces that commute}. \tag{7.102}$$

Hence (7.100) can be rewritten as

$$[\hat{A}_\mu(\boldsymbol{x},t), \dot{\hat{A}}_\nu(\boldsymbol{y},t)] = -ig_{\mu\nu}\delta^3(\boldsymbol{x}-\boldsymbol{y}) \tag{7.103}$$

and (7.101) remains the same. Now (7.103) is indeed very much the same
as ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ for the spatial components $\hat{A}^i$, but the sign is wrong in the
$\mu = \nu = 0$ case. We are not out of the maze yet.
Nevertheless, proceeding onwards on the basis of (7.103), we write the
quantum mode expansion as (cf (7.88))

$$\hat{A}^\mu(x) = \sum_{\lambda=0}^{3}\int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\epsilon^\mu(k,\lambda)\hat\alpha_\lambda(k)e^{-ik\cdot x} + \epsilon^{\mu*}(k,\lambda)\hat\alpha^\dagger_\lambda(k)e^{ik\cdot x}\right] \tag{7.104}$$

where the sum is over four independent polarization states $\lambda = 0, 1, 2, 3$, since
all four fields are still in play. Before continuing, we need to say more about
these $\epsilon$’s (previously, we only had two of them, now we have four and they
are 4-vectors). We take $\boldsymbol{k}$ to be along the z-direction, as in our discussion of
the $\epsilon$’s in section 7.3.1, and choose two transverse polarization vectors as (cf
(7.81), (7.82))

$$\epsilon^\mu(k,\lambda=1) = (0,1,0,0), \qquad \epsilon^\mu(k,\lambda=2) = (0,0,1,0) \qquad \text{‘transverse polarizations’}. \tag{7.105}$$

The other two $\epsilon$’s are

$$\epsilon^\mu(k,\lambda=0) = (1,0,0,0) \qquad \text{‘time-like polarization’} \tag{7.106}$$

and

$$\epsilon^\mu(k,\lambda=3) = (0,0,0,1) \qquad \text{‘longitudinal polarization’}. \tag{7.107}$$

Making (7.104) consistent with (7.103) then requires

$$[\hat\alpha_\lambda(k), \hat\alpha^\dagger_{\lambda'}(k')] = -g_{\lambda\lambda'}(2\pi)^3\delta^3(\boldsymbol{k}-\boldsymbol{k}'). \tag{7.108}$$

This is where the wrong sign in (7.103) has come back to haunt us: we have
the wrong sign in (7.108) for the case $\lambda = \lambda' = 0$ (time-like modes).

What is the consequence of this? It seems natural to assume that the
vacuum is defined by

$$\hat\alpha_\lambda(k)|0\rangle = 0 \quad \text{for all } \lambda = 0, 1, 2, 3. \tag{7.109}$$

But suppose we use (7.108) and (7.109) to calculate the normalization overlap
of a ‘one time-like photon’ state; this is

$$\langle k', \lambda=0|k, \lambda=0\rangle = \langle 0|\hat\alpha_0(k)\hat\alpha^\dagger_0(k')|0\rangle = -(2\pi)^3\delta^3(\boldsymbol{k}-\boldsymbol{k}') \tag{7.110}$$

and the state effectively has a negative norm (the $\boldsymbol{k} = \boldsymbol{k}'$ infinity is the
standard plane-wave artefact). Such states would threaten fundamental properties
such as the conservation of total probability if they contributed, uncancelled,
in physical processes.

At this point we would do well to recall the condition ‘$\partial_\mu\hat{A}^\mu = 0$’, which
still needs to be taken into account, somehow, and it does indeed save us.
Gupta (1950) and Bleuler (1950) proposed that, rather than trying
(unsuccessfully) to impose it as an operator condition, one should replace it by the
weaker condition

$$\partial_\mu\hat{A}^{\mu(+)}(x)|\Psi\rangle = 0 \tag{7.111}$$

where the (+) signifies the positive frequency part of $\hat{A}$, i.e. the part involving
annihilation operators, and $|\Psi\rangle$ is any physical state (including $|0\rangle$). From
(7.111) and its Hermitian conjugate

$$\langle\Psi|\partial_\mu\hat{A}^{\mu(-)}(x) = 0 \tag{7.112}$$

we can deduce that the Lorentz condition (7.70) does hold for all expectation
values:

$$\langle\Psi|\partial_\mu\hat{A}^\mu|\Psi\rangle = \langle\Psi|\partial_\mu\hat{A}^{\mu(+)} + \partial_\mu\hat{A}^{\mu(-)}|\Psi\rangle = 0, \tag{7.113}$$

and so the classical limit of this quantization procedure will recover the
classical Maxwell theory in Lorentz gauge.

Using (7.104), (7.106) and (7.107) with $k^\mu = (|\boldsymbol{k}|, 0, 0, |\boldsymbol{k}|)$, condition
(7.111) becomes

$$[\hat\alpha_0(k) - \hat\alpha_3(k)]|\Psi\rangle = 0. \tag{7.114}$$
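
The step from (7.111) to (7.114) can be spelled out as follows. This is a sketch of the intermediate algebra under the conventions of (7.104)-(7.107), with metric signature $(+,-,-,-)$; it is our own filling-in rather than a quotation from the text.

```latex
\partial_\mu \hat A^{\mu(+)}(x)
  = \sum_{\lambda=0}^{3}\int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\,
    \bigl(-i\,k\cdot\epsilon(k,\lambda)\bigr)\,\hat\alpha_\lambda(k)\,e^{-ik\cdot x},
\\[4pt]
k\cdot\epsilon(k,0) = |\boldsymbol{k}|,\qquad
k\cdot\epsilon(k,1) = k\cdot\epsilon(k,2) = 0,\qquad
k\cdot\epsilon(k,3) = -|\boldsymbol{k}| .
```

The integrand is therefore proportional to $|\boldsymbol{k}|\,[\hat\alpha_0(k) - \hat\alpha_3(k)]$, and requiring it to annihilate $|\Psi\rangle$ for every $\boldsymbol{k}$ gives (7.114).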


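To see the effect of this condition, consider the expression for the Hamiltonian
of this theory. In normally ordered form, it turns out to be

$$\hat{H} = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\,\omega\left(\hat\alpha^\dagger_1\hat\alpha_1 + \hat\alpha^\dagger_2\hat\alpha_2 + \hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_0\hat\alpha_0\right) \tag{7.115}$$

so the contribution from the time-like modes looks dangerously negative.
However, for any physical state $|\Psi\rangle$, we have

$$\begin{aligned}
\langle\Psi|(\hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_0\hat\alpha_0)|\Psi\rangle &= \langle\Psi|(\hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_3\hat\alpha_0)|\Psi\rangle \\
&= \langle\Psi|\hat\alpha^\dagger_3(\hat\alpha_3 - \hat\alpha_0)|\Psi\rangle \\
&= 0,
\end{aligned} \tag{7.116}$$

so that only the transverse modes survive.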

We hope that by now the reader will have at least begun to develop a
healthy respect for quantum gauge fields, and the non-Abelian versions in
volume 2 are even worse! The fact is that the canonical approach has a difficult
time coping with these constrained systems. Indeed, the complete Feynman
rules in the non-Abelian case were found by an alternative quantization
procedure (‘path integral’ quantization). This, however, is outside the scope of
the present volume. The important points for our purposes are as follows. It
is possible to carry out a consistent quantization in the Gupta-Bleuler
formalism, which is the quantum version of the Maxwell theory constrained by
the Lorentz condition. The propagator for the photon in this theory is

$$\frac{-ig^{\mu\nu}}{k^2 + i\epsilon} \tag{7.117}$$

which is the expected massless limit of the KG propagator as far as the spatial
components are concerned (the time-like component has that negative sign).

As in all the other cases we have dealt with so far, the Feynman propagator
$\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$ can be evaluated using the expansion (7.104) and the
commutation relations (7.108). One finds that it is indeed equal to the Fourier
transform of $-ig^{\mu\nu}/(k^2 + i\epsilon)$ just as asserted in (7.117). For this result, we need
the ‘pseudo completeness relation’ (problem 7.13)

$$\begin{aligned}
&-\epsilon^\mu(k,\lambda=0)\epsilon^\nu(k,\lambda=0) + \epsilon^\mu(k,\lambda=1)\epsilon^\nu(k,\lambda=1) \\
&\quad + \epsilon^\mu(k,\lambda=2)\epsilon^\nu(k,\lambda=2) + \epsilon^\mu(k,\lambda=3)\epsilon^\nu(k,\lambda=3) = -g^{\mu\nu}.
\end{aligned} \tag{7.118}$$

We call this a pseudo completeness relation because of the minus sign
appearing in the first term: its origin in the evaluation of this vev is precisely the
‘wrong sign commutator’ for the $\hat\alpha_0$ mode, (7.108).
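
The pseudo completeness relation (7.118) is quick to confirm with the explicit vectors (7.105)-(7.107). The following is a minimal sketch using plain Python lists; the names `EPS` and `pseudo_completeness_sum` are ours.

```python
# Metric g^{mu nu} = diag(1, -1, -1, -1).
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# Polarization 4-vectors (7.105)-(7.107) for k along the z-direction.
EPS = {
    0: [1, 0, 0, 0],  # time-like
    1: [0, 1, 0, 0],  # transverse
    2: [0, 0, 1, 0],  # transverse
    3: [0, 0, 0, 1],  # longitudinal
}


def pseudo_completeness_sum():
    """Left-hand side of (7.118): the lambda = 0 term enters with a minus sign."""
    weight = {0: -1, 1: 1, 2: 1, 3: 1}
    return [[sum(weight[lam] * EPS[lam][mu] * EPS[lam][nu] for lam in range(4))
             for nu in range(4)] for mu in range(4)]


minus_g = [[-G[mu][nu] for nu in range(4)] for mu in range(4)]
print(pseudo_completeness_sum() == minus_g)  # True
```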

Thus the gauge choice (7.70) can be made to work in quantum field theory
via the condition (7.111). But other choices are possible too. In particular, a
useful generalization of the Lagrangian (7.97) is

$$\mathcal{L}_\xi = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 \tag{7.119}$$

where $\xi$ is a constant, the ‘gauge parameter’. $\mathcal{L}_\xi$ leads to the equation of
motion (problem 7.14)

$$\left(\Box g_{\mu\nu} - \partial_\mu\partial_\nu + \frac{1}{\xi}\partial_\mu\partial_\nu\right)A^\nu = 0. \tag{7.120}$$

In momentum-space this becomes (problem 7.14)

$$\left(-k^2 g_{\mu\nu} + k_\mu k_\nu - \frac{1}{\xi}k_\mu k_\nu\right)\tilde{A}^\nu = 0. \tag{7.121}$$

The inverse of the matrix acting on $\tilde{A}^\nu$ exists, and gives us the more general
photon propagator (or Green function)

$$\frac{i\left[-g^{\mu\nu} + (1-\xi)k^\mu k^\nu/k^2\right]}{k^2 + i\epsilon} \tag{7.122}$$

as shown in problem 7.14. The previous case is recovered as $\xi \to 1$.
Confusingly, the choice $\xi = 1$ is often called the ‘Feynman gauge’, though in classical
terms it corresponds to the Lorentz gauge choice. For some purposes the
‘Landau gauge’ $\xi = 0$ (which is well defined in (7.122)) is convenient. In any event,
it is important to be clear that the photon propagator depends on the choice
of gauge. Formula (7.122) is the photon analogue of ‘rule (ii)’ in (6.103).

This may seem to imply that when we use the photon propagator (7.122)
in Feynman amplitudes we will not get a definite answer, but rather one
that depends on the arbitrary parameter $\xi$. This is a serious worry. But the
propagator is not by itself a physical quantity: it is only one part of a physical
amplitude. In the following chapter we shall derive the amplitudes for some
simple processes in scalar and spinor electrodynamics, and one can verify that
they are gauge invariant, either in the sense (for external photons) of being
invariant under the replacement (7.76), or (in the case of internal photons) of
being independent of $\xi$. It can be shown (Weinberg 1995, section 10.5) that
at a given order in perturbation theory the sum of all diagrams contributing
to the S-matrix is gauge invariant.
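
As a numerical cross-check of (7.121) and (7.122), one can multiply the momentum-space operator by the tensor structure of the propagator and confirm that the result is the unit matrix. The sketch below is our own illustration: the momentum is an arbitrary off-shell choice, the overall factor $i$ and the $i\epsilon$ are dropped, and $\xi = 0$ is excluded because the operator in (7.121) contains $1/\xi$. The function names are hypothetical.

```python
# Metric diag(1, -1, -1, -1); numerically the same with upper or lower indices.
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def operator_matrix(k_up, xi):
    """M_{mu nu} = -k^2 g_{mu nu} + (1 - 1/xi) k_mu k_nu, from (7.121)."""
    k_dn = [G[a][a] * k_up[a] for a in range(4)]
    k2 = sum(k_up[a] * k_dn[a] for a in range(4))
    return [[-k2 * G[m][n] + (1.0 - 1.0 / xi) * k_dn[m] * k_dn[n]
             for n in range(4)] for m in range(4)]


def propagator_numerator(k_up, xi):
    """N^{nu sigma} = [-g^{nu sigma} + (1 - xi) k^nu k^sigma / k^2] / k^2."""
    k2 = sum(G[a][a] * k_up[a] ** 2 for a in range(4))
    return [[(-G[n][s] + (1.0 - xi) * k_up[n] * k_up[s] / k2) / k2
             for s in range(4)] for n in range(4)]


def product(k_up, xi):
    """Return M_{mu nu} N^{nu sigma}, which should be the unit matrix."""
    m = operator_matrix(k_up, xi)
    n = propagator_numerator(k_up, xi)
    return [[sum(m[a][b] * n[b][c] for b in range(4)) for c in range(4)]
            for a in range(4)]


k_up = [3.0, 1.0, -2.0, 0.5]  # arbitrary off-shell momentum, k^2 = 3.75
for xi in (1.0, 0.5, 3.0):
    for row in product(k_up, xi):
        print([round(x, 12) for x in row])
```

The same unit matrix appears for each value of $\xi$, which is the sense in which every member of the family (7.122) is a legitimate Green function for its own gauge-fixed equation of motion.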

7.4 Introduction of electromagnetic interactions


After all these preliminaries, the job of introducing the first of our gauge 
field interactions, namely electromagnetism, into our non-interacting theory 
of complex scalar fields, and of Dirac fields, is very easy. From our discussion 
in chapter 2, we have a strong indication of how to introduce electromagnetic 
interactions into our theories. The ‘gauge principle’ in quantum mechanics 
consisted in elevating a global (space-time-independent) U(1) phase invariance 
into a local (space-time-dependent) U(1) invariance — the compensating fields 
being then identified with the electromagnetic ones. In quantum field theory, 
exactly the same principle exists and leads to the form of the electromagnetic 
interactions. Indeed, in the field theory formalism we have a true local U(1) 
phase (gauge) invariance of the Lagrangian (rather than a gauge covariance 
of a wave equation) and we shall be able to exhibit explicitly the symmetry 
current, and symmetry operator, associated with the U(1) invariance — and 
identify them precisely with the electromagnetic current and charge. 

We have seen that for both the complex scalar and the Dirac fields the
free Lagrangian is invariant under U(1) transformations (see (7.22) and (7.48))
which, we once again emphasize, are global. Let us therefore promote these
global invariances into local ones in the way learned in chapter 2, namely by
invoking the ‘gauge principle’ replacement

$$\partial^\mu \to \hat{D}^\mu = \partial^\mu + iq\hat{A}^\mu \tag{7.123}$$

for a particle of charge $q$, this time written in terms of the quantum field $\hat{A}^\mu$.
In the case of the Dirac Lagrangian

$$\hat{\mathcal{L}}_D = \hat{\bar\psi}(i\gamma^\mu\partial_\mu - m)\hat\psi \tag{7.124}$$

we expect to be able to ‘promote’ it to one which is invariant under the local
U(1) phase transformation¹

$$\hat\psi(\boldsymbol{x},t) \to \hat\psi'(\boldsymbol{x},t) = e^{-iq\hat\chi(\boldsymbol{x},t)}\hat\psi(\boldsymbol{x},t) \tag{7.125}$$

provided we make the replacement (7.123) and demand that the (quantized)
4-vector potential transforms as (cf (2.15) with the sign change for $\hat\chi$)

$$\hat{A}^\mu \to \hat{A}'^\mu = \hat{A}^\mu + \partial^\mu\hat\chi. \tag{7.126}$$

Thus the locally U(1)-invariant Dirac Lagrangian is expected to be

$$\hat{\mathcal{L}}_{D\,{\rm local}} = \hat{\bar\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi. \tag{7.127}$$

¹Note that the classical field $\chi(\boldsymbol{x},t)$ of (2.34) has become a quantum field $\hat\chi(\boldsymbol{x},t)$ in
(7.125); the sign change of $\hat\chi$ compared with $\chi$ is conventional in qft.
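The invariance of (7.127) under (7.125) is easy to check, using the crucial
property (2.43), which clearly carries over to the quantum field case:

$$\hat{D}'_\mu\hat\psi' = e^{-iq\hat\chi}(\hat{D}_\mu\hat\psi). \tag{7.128}$$

Equation (7.128) implies at once that

$$(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi, \tag{7.129}$$

while taking the conjugate of (7.125) yields

$$\hat{\bar\psi}' = \hat{\bar\psi}e^{iq\hat\chi}. \tag{7.130}$$

Thus we have

$$\hat{\bar\psi}'(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = \hat{\bar\psi}e^{iq\hat\chi}e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \tag{7.131}$$
$$= \hat{\bar\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \tag{7.132}$$

and the invariance is proved.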
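
For readers who want the property (7.128) written out, the following sketch combines (7.123), (7.125) and (7.126). It treats $\hat\chi$ as commuting with its own derivative and with $\hat{A}_\mu$ and $\hat\psi$, which is an assumption of this sketch rather than something established above.

```latex
\hat D'_\mu \hat\psi'
 = \bigl(\partial_\mu + iq\hat A_\mu + iq\,\partial_\mu\hat\chi\bigr)\, e^{-iq\hat\chi}\hat\psi
 = e^{-iq\hat\chi}\Bigl(\partial_\mu\hat\psi - iq(\partial_\mu\hat\chi)\hat\psi
     + iq\hat A_\mu\hat\psi + iq(\partial_\mu\hat\chi)\hat\psi\Bigr)
 = e^{-iq\hat\chi}\,\hat D_\mu\hat\psi .
```

The derivative of the phase factor is cancelled exactly by the shift in the potential, which is the whole point of the compensating field.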
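The Lagrangian has therefore gained an interaction term

$$\hat{\mathcal{L}}_D \to \hat{\mathcal{L}}_{D\,{\rm local}} = \hat{\mathcal{L}}_D + \hat{\mathcal{L}}_{\rm int} \tag{7.133}$$

where

$$\hat{\mathcal{L}}_{\rm int} = -q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu. \tag{7.134}$$

Since the addition of $\hat{\mathcal{L}}_{\rm int}$ has not changed the canonical momenta, the
Hamiltonian then becomes $\hat{\mathcal{H}} = \hat{\mathcal{H}}_D + \hat{\mathcal{H}}'_D$, where

$$\hat{\mathcal{H}}'_D = -\hat{\mathcal{L}}_{\rm int} = q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu = q\hat\psi^\dagger\hat\psi\hat{A}^0 - q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi\cdot\hat{\boldsymbol{A}} \tag{7.135}$$

which is the field theory analogue of the potential in (3.102). It has the
expected form ‘$\rho A^0 - \boldsymbol{j}\cdot\boldsymbol{A}$’ if we identify the electromagnetic charge density
operator with $q\hat\psi^\dagger\hat\psi$ (the charge times the number density operator) and the
electromagnetic current density operator with $q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi$. The electromagnetic
4-vector current operator $\hat{j}^\mu_{\rm em}$ is thus identified as

$$\hat{j}^\mu_{\rm em} = q\hat{\bar\psi}\gamma^\mu\hat\psi, \tag{7.136}$$

which is gauge invariant and a Lorentz 4-vector. The Lagrangian (7.134) is
manifestly Lorentz invariant.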
We now note that $\hat{j}^\mu_{\rm em}$ is just $q$ times the symmetry current $\hat{N}^\mu_\psi$ of
section 7.2 (see equation (7.50)). Conservation of $\hat{j}^\mu_{\rm em}$ would follow from global
U(1) invariance alone (i.e. $\hat\chi$ a constant in equation (7.125)); but many
Lagrangians, including interactions, could be constructed obeying this global
U(1) invariance. The force of the local U(1) invariance requirement is that it
has specified a unique form of the interaction (i.e. $\hat{\mathcal{L}}_{\rm int}$ of equation (7.134)).
Indeed, this is just $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$, so that in this type of theory the current $\hat{j}^\mu_{\rm em}$ is
not only a symmetry current, but also determines the precise way in which the
vector potential $\hat{A}^\mu$ couples to the matter field $\hat\psi$. Adding the Lagrangian for
the $\hat{A}^\mu$ field then completes the theory of a charged fermion field interacting
with the Maxwell field. In a general gauge, the $\hat{A}^\mu$ field Lagrangian is the
operator form of (7.119), $\hat{\mathcal{L}}_\xi$.

FIGURE 7.3
Possible basic ‘vertices’ associated with the interaction density $e\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$;
these cannot occur as physical processes due to energy-momentum constraints.

The interaction term $\hat{\mathcal{H}}'_D = q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ is a ‘three-fields-at-a-point’ kind of
interaction just like our 3-scalar interaction $g\hat\phi_A\hat\phi_B\hat\phi_C$ in chapter 6. We know,
by now, exactly what all the operators in $\hat{\mathcal{H}}'_D$ are capable of: some of the
possible emission and absorption processes are shown in figure 7.3. Unlike the
‘ABC’ model with $m_C > m_A + m_B$ however, none of these elementary ‘vertex’
processes can occur as a real physical process, because all are forbidden by
the requirement of overall 4-momentum conservation. However, they will of
course contribute as virtual transitions when ‘paired up’ to form Feynman
diagrams, such as those in figure 7.4 (compare figures 6.4 and 6.5).

It is worth remarking on the fact that the ‘coupling constant’ $q$ is
dimensionless, in our units. Of course, we know this from its identification with the
electromagnetic charge in this case (see appendix C). But it is instructive to
check it as follows. A Lagrangian density has mass dimension $M^4$, since the
action is dimensionless (with $\hbar = 1$). Referring then to (7.33) we see that the
(mass) dimension of the $\hat\psi$ field is $M^{3/2}$, while (7.67) shows that that of $\hat{A}^\mu$
is $M$. It follows that $\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ has mass dimension $M^4$, and hence $q$ must be
dimensionless.
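
The dimension counting above is simple enough to do by hand, but it can also be written as a short bookkeeping exercise. This is only an illustrative sketch: the constants hold the mass dimensions quoted in the paragraph, and the function name `coupling_dimension` is our own.

```python
from fractions import Fraction

# Mass dimensions quoted in the text (units with hbar = c = 1).
DIM_LAGRANGIAN_DENSITY = Fraction(4)  # action is dimensionless in 4 dimensions
DIM_PSI = Fraction(3, 2)              # Dirac field
DIM_A = Fraction(1)                   # Maxwell field


def coupling_dimension():
    """Mass dimension left over for q in q * psibar * gamma^mu * psi * A_mu."""
    return DIM_LAGRANGIAN_DENSITY - 2 * DIM_PSI - DIM_A


print(coupling_dimension())  # 0, so q is dimensionless
```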

The application of the Dyson formalism of chapter 6 to fermions interacting
via $\hat{\mathcal{H}}'_D$ leads directly to the Feynman rules for associating precise mathematical
formulae with diagrams such as those in figure 7.4, as usual. This will
be presented in the following chapter: see comment (3) in section 8.3.1 and
appendix L. We may simply note here that a $\hat\psi$ appears along with a $\hat{\bar\psi}$ in
$\hat{\mathcal{H}}'_D$, so that the process of ‘contraction’ (cf chapter 6) will lead to the form
$\langle 0|T(\hat\psi(x_1)\hat{\bar\psi}(x_2))|0\rangle$ of the Dirac propagator, as stated in section 7.2.

FIGURE 7.4
Lowest-order contributions to $\gamma e^- \to \gamma e^-$.

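In the same way, the global U(1) invariance (7.22) of the complex scalar
field may be generalized to a local U(1) invariance incorporating
electromagnetism. We have

$$\hat{\mathcal{L}}_{\rm KG} \to \hat{\mathcal{L}}_{\rm KG} + \hat{\mathcal{L}}_{\rm int} \tag{7.137}$$

where

$$\hat{\mathcal{L}}_{\rm KG} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - m^2\hat\phi^\dagger\hat\phi \tag{7.138}$$

and (under $\partial_\mu \to \hat{D}_\mu$)

$$\hat{\mathcal{L}}_{\rm int} = -iq\left(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi\right)\hat{A}_\mu + q^2\hat{A}^\mu\hat{A}_\mu\hat\phi^\dagger\hat\phi \tag{7.139}$$

which is the field theory analogue of the interaction in (3.100). The
electromagnetic current is

$$\hat{j}^\mu_{\rm em} = -\partial\hat{\mathcal{L}}_{\rm int}/\partial\hat{A}_\mu \tag{7.140}$$

as before, which from (7.139) is

$$\hat{j}^\mu_{\rm em} = iq\left(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi\right) - 2q^2\hat{A}^\mu\hat\phi^\dagger\hat\phi. \tag{7.141}$$

We note that for the boson case the electromagnetic current is not just $q$
times the (number) current $\hat{N}^\mu_\phi$ appropriate to the global phase invariance.
This has its origin in the fact that the boson current involves a derivative,
and so the gauge invariant boson current must develop a term involving $\hat{A}^\mu$
itself, as is evident in (7.141), and as we also saw in the wavefunction case
(cf equation (2.40)). The full scalar QED Lagrangian is completed by the
inclusion of $\hat{\mathcal{L}}_\xi$ as before.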
The application of the formalism of chapter 6 is not completely
straightforward in this scalar case. The problem is that $\hat{\mathcal{L}}_{\rm int}$ of (7.139) involves
derivatives of the fields and, in particular, their time derivatives. Hence the
canonical momenta will be changed from their non-interacting forms. This, in turn,
implies that the additional (interaction) term in the Hamiltonian is not just
$-\hat{\mathcal{L}}_{\rm int}$, as in the Dirac case, but is given by (problem 7.15)

$$\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int} - q^2(\hat{A}^0)^2\hat\phi^\dagger\hat\phi. \tag{7.142}$$

The problem here is that the Hamiltonian and $-\hat{\mathcal{L}}_{\rm int}$ differ by a term which is
non-covariant (only $\hat{A}^0$ appears). This seems to threaten the whole approach
of chapter 6. Fortunately, another subtlety rescues the situation. There is
a second source of non-covariance arising from the time-ordering of terms
involving time derivatives, which will occur when (7.142) is used in the Dyson
series (6.42). In particular, one can show (problem 7.16) that

$$\langle 0|T(\partial_{1\mu}\hat\phi(x_1)\,\partial_{2\nu}\hat\phi^\dagger(x_2))|0\rangle = \partial_{1\mu}\partial_{2\nu}\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle - ig_{\mu 0}g_{\nu 0}\delta^4(x_1 - x_2) \tag{7.143}$$

which also exhibits a non-covariant piece. A careful analysis (Itzykson and
Zuber 1980, section 6.1.4) shows that the two non-covariant effects exactly
compensate, so that in the Dyson series we may use $\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int}$ after all. The
Feynman rules for charged scalar electrodynamics are given in appendix L.




7.5 P, C and T in quantum field theory 


We end this chapter by completing the discussion of the discrete symmetries 
which we began in section 4.2, extending it from the single particle (wave- 
function) theory to quantum fields. We begin with the parity transformation. 


7.5.1 Parity 


The algebraic manipulations of section 4.2.1 apply equally well to the
equations of motion for the quantum field, and we can take over the results by
replacing a transformed wavefunction such as $\psi_P(\boldsymbol{x},t)$ by the corresponding
transformed field $\hat\psi_P(\boldsymbol{x},t) = \hat{P}\hat\psi(\boldsymbol{x},t)\hat{P}^{-1}$ where $\hat{P}$ is a unitary quantum field
operator (which we shall not need to calculate explicitly). Thus we have

$$\hat\phi_P(\boldsymbol{x},t) = \hat\phi(-\boldsymbol{x},t) \tag{7.144}$$
$$\hat\psi_P(\boldsymbol{x},t) = \beta\hat\psi(-\boldsymbol{x},t) \tag{7.145}$$

for the KG and Dirac fields, and

$$\hat{\boldsymbol{A}}_P(\boldsymbol{x},t) = -\hat{\boldsymbol{A}}(-\boldsymbol{x},t), \qquad \hat{A}^0_P(\boldsymbol{x},t) = \hat{A}^0(-\boldsymbol{x},t) \tag{7.146}$$

for the electromagnetic fields. In (7.144)-(7.146) a simple choice of phase
factor has been made.
There is however one new feature in the quantum field case, which is that 
the commutation or anticommutation relations must be left unchanged by 
the transformation, if it is to be an invariance of the theory. Evidently for P 
the only non-trivial case is the Dirac field, and it is easy to check that the 
anticommutation relations (7.44) and (7.45) are invariant under (7.145). 

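Let us see the effect of $\hat{P}$ on the free particle expansion (7.35). Equation
(7.145) becomes

$$\begin{aligned}
\hat{P}\hat\psi(\boldsymbol{x},t)\hat{P}^{-1} &= \int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\sum_{s=1,2}\left[\hat{P}\hat{c}_s(k)\hat{P}^{-1}u(k,s)e^{-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{P}\hat{d}^\dagger_s(k)\hat{P}^{-1}v(k,s)e^{i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}}\right] \\
&= \int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\sum_{s=1,2}\left[\hat{c}_s(k)\beta u(k,s)e^{-i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{d}^\dagger_s(k)\beta v(k,s)e^{i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}}\right].
\end{aligned} \tag{7.147}$$

Changing $\boldsymbol{k}$ to $-\boldsymbol{k}$ in the second integral and using the spinor properties

$$\beta u((\omega,-\boldsymbol{k}),s) = u(k,s), \qquad \beta v((\omega,-\boldsymbol{k}),s) = -v(k,s) \tag{7.148}$$

in the right hand side of (7.147), we obtain the conditions

$$\hat{P}\hat{c}_s(k)\hat{P}^{-1} = \hat{c}_s(\omega,-\boldsymbol{k}), \qquad \hat{P}\hat{d}^\dagger_s(k)\hat{P}^{-1} = -\hat{d}^\dagger_s(\omega,-\boldsymbol{k}) \tag{7.149}$$

with similar ones for $\hat{c}^\dagger_s$ and $\hat{d}_s$. Since $\hat{c}^\dagger_s$ creates a fermion from the vacuum and
$\hat{d}^\dagger_s$ creates its antiparticle, it follows that a fermion and its antiparticle have
opposite intrinsic parities. Similarly, equation (7.146) shows, when applied
to the expansion (7.104), that a physical (transverse) photon has negative
intrinsic parity.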

Turning now to the electromagnetic interaction, it is clear that $\hat{j}^\mu_{\rm em}(x) =
q\hat{\bar\psi}(x)\gamma^\mu\hat\psi(x)$ has exactly the same transformation properties under $\hat{P}$ as
$\bar\psi\gamma^\mu\psi(x)$ had, namely $\hat{j}^0_{\rm em}(x)$ is a scalar and $\hat{\boldsymbol{j}}_{\rm em}(x)$ is a polar vector. Since
this is also the way $\hat{A}^\mu$ transforms, according to (7.146), it follows that the
interaction $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is parity invariant, as we expect for QED. The scalar
interaction (7.139) is also parity invariant.


7.5.2 Charge conjugation 


The discussion of C proceeds similarly, the transformation being represented
by a unitary quantum field operator $\hat{C}$ such that

$$\hat{C}\hat\phi\hat{C}^{-1} = \hat\phi^\dagger \tag{7.150}$$
$$\hat{C}\hat\psi\hat{C}^{-1} = i\gamma^2\hat\psi^{\dagger T} \tag{7.151}$$
$$\hat{C}\hat{A}^\mu\hat{C}^{-1} = -\hat{A}^\mu \tag{7.152}$$

in the three cases of interest. Note that in terms of the decomposition (7.15)
of the complex field $\hat\phi$ into the two real fields $\hat\phi_1$ and $\hat\phi_2$, (7.150) reads

$$\hat{C}(\hat\phi_1 - i\hat\phi_2)\hat{C}^{-1} = \hat\phi_1 + i\hat\phi_2. \tag{7.153}$$


The reader may check (problem 7.17(a)) that the Dirac field anticommutation
relations are invariant under (7.151).
Applying (7.150) to the free field expansion (7.16), we easily find

$$\hat{C}\hat{a}(k)\hat{C}^{-1} = \hat{b}(k), \qquad \hat{C}\hat{b}^\dagger(k)\hat{C}^{-1} = \hat{a}^\dagger(k), \tag{7.154}$$

so that particle and antiparticle operators are interchanged. The conditions
(7.154) are of course consistent with (7.153). It follows that the normally
ordered $\hat{H}$ of (7.21) is even under $\hat{C}$, while the normally ordered number
density (7.24) is odd, the ordering being with Bose commutation relations.
Carrying out the same steps for the Dirac field, and using the spinor relations
(4.95) and (4.96), we obtain

$$\hat{C}\hat{c}_s(k)\hat{C}^{-1} = \hat{d}_s(k), \qquad \hat{C}\hat{d}^\dagger_s(k)\hat{C}^{-1} = \hat{c}^\dagger_s(k); \tag{7.155}$$

particle and antiparticle operators are again interchanged. We particularly
note that the Dirac Hamiltonian (7.55) is even under $\hat{C}$, while the Dirac
number operator (7.54) is odd, in both cases after normal ordering with
anticommutation relations (Fermi statistics). The reader may check (problem
7.17(b)) that the electromagnetic current density $q\hat{\bar\psi}(x)\gamma^\mu\hat\psi(x)$ is odd under
$\hat{C}$, when normally ordered, and so the interaction $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is C-invariant. The
same is true for the KG case, after normal ordering using Bose statistics.

In section 4.2.2 we introduced self-conjugate (Majorana) spinors. In ex- 
tending that discussion to quantum field theory, it is again convenient to use 
the alternative representation (3.40) for the Dirac matrices, since we can then 
read off the Lorentz transformation properties from the results of section 4.1.2. 
Consider the 4-component Majorana field 


n _ ( -io2XtT (a) 
saute) = ( 2a) J (7.156) 


It is easy to check from (4.19) and (4.42) that the quantity o2y*(«) transforms 
like a ¢-type spinor, and so the construction (7.156) is consistent with Lorentz 
covariance. The C-conjugate field is 


$$\hat{\psi}_{{\rm M},C}(x) = i\gamma^2\hat{\psi}_{\rm M}^{\dagger T}(x) = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix}\begin{pmatrix} -i\sigma_2\hat{\chi}(x) \\ \hat{\chi}^{\dagger T}(x) \end{pmatrix} = \hat{\psi}_{\rm M}(x), \tag{7.157}$$

showing that it is self-conjugate. It is clear that the Majorana field has only two independent degrees of freedom, those in $\hat{\chi}(x)$, in contrast to the Dirac field which has four (we could of course have equally well constructed a Majorana field using a $\phi$-type spinor field instead of a $\chi$-type one). The four degrees of freedom of the Dirac field correspond physically to fermion and antifermion, spin up and down, but the Majorana fermion is the same as its antiparticle. The free field expansion corresponding to (7.35) for a Majorana field is

$$\hat{\psi}_{\rm M}(x) = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \sum_\lambda \left[ \hat{c}_\lambda(k)u(k,\lambda)e^{-ik\cdot x} + \hat{c}_\lambda^\dagger(k)v(k,\lambda)e^{ik\cdot x} \right]. \tag{7.158}$$


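The Lagrangian for a free Majorana field may be taken to be $\bar{\hat{\psi}}_{\rm M}(i\slashed{\partial} - m)\hat{\psi}_{\rm M}$, which the reader can rewrite in terms of $\hat{\chi}$. For example, the mass term is

$$-m\bar{\hat{\psi}}_{\rm M}\hat{\psi}_{\rm M} = -m\hat{\chi}^{T} i\sigma_2\hat{\chi} + \text{Hermitian conjugate}. \tag{7.159}$$

We note that this expression will vanish unless the components $\hat{\chi}_1$ and $\hat{\chi}_2$ anticommute with each other.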
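The two statements just made, self-conjugacy of the Majorana construction and the vanishing of the mass term for commuting components, can be checked numerically. The following is only a sketch: it assumes that the matrix $i\gamma^2$ in the representation (3.40) is the one displayed in (7.157), and it replaces the field components by ordinary (commuting) complex numbers, which is precisely the situation in which the mass term should vanish. The variable names are ours.

```python
import numpy as np

sigma2 = np.array([[0, -1j], [1j, 0]])
zero2 = np.zeros((2, 2), dtype=complex)

# i*gamma^2 in the representation (3.40); assumed to be the matrix shown in (7.157)
i_gamma2 = np.block([[zero2, -1j * sigma2], [1j * sigma2, zero2]])

# Commuting complex numbers standing in for the two components of chi
# (an assumption: the true field components are anticommuting operators)
rng = np.random.default_rng(0)
chi = rng.normal(size=2) + 1j * rng.normal(size=2)

# Majorana construction (7.156): upper components -i sigma_2 chi^*, lower components chi
psi_M = np.concatenate([-1j * sigma2 @ chi.conj(), chi])

# C-conjugate spinor, i gamma^2 psi^*, as in (7.157)
psi_M_C = i_gamma2 @ psi_M.conj()

# chi^T (i sigma_2) chi from the mass term (7.159); i sigma_2 is antisymmetric
mass_bilinear = chi @ (1j * sigma2) @ chi

print(np.allclose(psi_M_C, psi_M))   # self-conjugacy
print(abs(mass_bilinear))            # zero up to rounding for commuting components
```

Because $i\sigma_2$ is antisymmetric, the bilinear reduces to $\chi_1\chi_2 - \chi_2\chi_1$, which is why only anticommuting components can give a non-zero mass term.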


7.5.3 Time reversal 


In section 4.2.4 we found that the time reversal transformation for the single 
particle theories was not represented by a unitary operator, but rather by the 
product of a unitary operator and the complex conjugation operator. We can 
see that the same must be true in quantum field theory by considering the 
equation of motion (6.18) for a scalar field (for simplicity), in the interaction 
picture: 


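$$\frac{\partial\hat{\phi}(\mathbf{x},t)}{\partial t} = i[\hat{H}_0, \hat{\phi}(\mathbf{x},t)]. \tag{7.160}$$

Suppose the field $\hat{\phi}_T$ in the time reversed frame were related to $\hat{\phi}$ by a unitary quantum field operator $\hat{U}_T$ so that (suppressing the spatial argument) $\hat{U}_T\hat{\phi}(t)\hat{U}_T^\dagger = \hat{\phi}_T(t')$. Then applying $\hat{U}_T \ldots \hat{U}_T^\dagger$ to equation (7.160) we would obtain

$$\frac{\partial\hat{\phi}_T(t')}{\partial t} = i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')] \tag{7.161}$$

or equivalently, since $t' = -t$,

$$\frac{\partial\hat{\phi}_T(t')}{\partial t'} = -i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')]. \tag{7.162}$$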


To restore (7.162) to the form (7.160), i.e. for covariance to hold, would require that $\hat{U}_T$ transforms $\hat{H}_0$ to $-\hat{H}_0$. But this is unacceptable on physical grounds, because the eigenvalues of $\hat{H}_0$ must be positive relative to the vacuum, both before and after the transformation. We must therefore write the transformation as


$$\hat{T} = \hat{U}_T K \tag{7.163}$$


where, as in section 4.2.4, $K$ takes the complex conjugate of ordinary numbers and functions (i.e. it replaces $i$ by $-i$). The operator $\hat{U}_T$ depends on the field involved, but we shall not need to exhibit it explicitly.

We must now decide how the fields transform under $\hat{T}$. We can be guided by our work in section 4.2.4 in the single particle theory, remembering that a wavefunction is the vacuum to one particle matrix element of the corresponding quantum field operator (see Comment (5) in section 5.2.5), and also that matrix elements of operators and their time-reversed transforms are related by (4.126). In the case of the KG field, for example, let us take in (4.126) $\langle\psi_2| = \langle 0|$, $\hat{O} = \hat{\phi}(x)$, and $|\psi_1\rangle = |a; p\rangle$ for the state of one 'a' particle with 4-momentum $p$. Then (4.126) gives


$$\phi(x) = \langle 0|\hat{\phi}(x)|a; E, \mathbf{p}\rangle = \langle 0_T|\hat{T}\hat{\phi}(x)\hat{T}^{-1}|a; E, -\mathbf{p}\rangle^{*}, \tag{7.164}$$


where $\phi(x)$ is the free particle solution $\exp(-iEt + i\mathbf{p}\cdot\mathbf{x})/(2E)^{1/2}$. Now in section 4.2.4 we found the result $\phi_T(\mathbf{x},t) = \phi^{*}(\mathbf{x},-t)$, for the time-reversed solution. This will be consistent with (7.164) if we take, in the quantum field case,

$$\hat{T}\hat{\phi}(\mathbf{x},t)\hat{T}^{-1} = \hat{\phi}(\mathbf{x},-t), \tag{7.165}$$


assuming that the vacuum is invariant. Applying (7.165) to the free field 
expansion (4.5) gives 


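$$\hat{T}\hat{\phi}(\mathbf{x},t)\hat{T}^{-1} = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[ \hat{U}_T\hat{a}(k)\hat{U}_T^\dagger e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}} + \hat{U}_T\hat{b}^\dagger(k)\hat{U}_T^\dagger e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} \right] \tag{7.166}$$

$$= \hat{\phi}(\mathbf{x},-t) = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[ \hat{a}(k) e^{i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{b}^\dagger(k) e^{-i\omega t - i\mathbf{k}\cdot\mathbf{x}} \right]. \tag{7.167}$$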


Note that the plane wave functions have been complex conjugated in (7.166), 
because T contains K. Changing k to —k in the integral in (7.167), we obtain 
the conditions 


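$$\hat{U}_T\hat{a}(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{a}(\omega,-\mathbf{k}), \qquad \hat{U}_T\hat{b}^\dagger(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{b}^\dagger(\omega,-\mathbf{k}). \tag{7.168}$$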


The transformation preserves particle and antiparticle, and reverses the 3- 
momentum in the creation and annihilation operators. 
For the Dirac theory, we take, similarly, 


$$\hat{T}\hat{\psi}(\mathbf{x},t)\hat{T}^{-1} = i\alpha_1\alpha_3\hat{\psi}(\mathbf{x},-t) \tag{7.169}$$


as suggested by (4.118). The reader may check that the anticommutation 
relations are left invariant by (7.169). Applying (7.169) to the free field ex- 
pansion (7.35), and taking the spinors to be helicity eigenstates as in section 
4.2.5, we obtain the conditions 


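$$\hat{U}_T\hat{c}_\lambda(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{c}_\lambda(\omega,-\mathbf{k}), \qquad \hat{U}_T\hat{d}_\lambda^\dagger(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{d}_\lambda^\dagger(\omega,-\mathbf{k}). \tag{7.170}$$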


Once again, the 3-momentum has been reversed in the creation and annihilation operators.

Let us check the behaviour of the current density $\hat{j}^\mu_{\rm em}(x) = q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$ under the transformation (7.169). Recalling that in the standard representation $i\alpha_1\alpha_3 = \Sigma_2$, we find

$$\hat{T}\hat{j}^\mu_{\rm em}(\mathbf{x},t)\hat{T}^{-1} = q\hat{\psi}^\dagger(\mathbf{x},-t)\Sigma_2(\gamma^0\gamma^\mu)^{*}\Sigma_2\hat{\psi}(\mathbf{x},-t) = \hat{j}_{{\rm em}\,\mu}(\mathbf{x},-t). \tag{7.171}$$

This is exactly how $A^\mu(x)$, and hence $\hat{A}^\mu(x)$, transforms, and so the electromagnetic interaction $\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is T-invariant. The same is true in the KG case.
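The matrix algebra behind (7.171) can be verified directly. The sketch below assumes the standard representation ($\beta$ diagonal, $\alpha_i$ off-diagonal in the Pauli matrices) and uses $\gamma^0\gamma^\mu = (1, \boldsymbol{\alpha})$; it checks that $i\alpha_1\alpha_3 = \Sigma_2$ and that $\Sigma_2(\gamma^0\gamma^\mu)^{*}\Sigma_2$ leaves the time component unchanged while reversing the sign of the spatial components, which is the index lowering seen in (7.171). All names are ours.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard representation (assumed): alpha_i is off-diagonal in sigma_i
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (s1, s2, s3)]
Sigma2 = np.block([[s2, Z2], [Z2, s2]])

# gamma^0 gamma^mu = (1, alpha_1, alpha_2, alpha_3)
g0_gmu = [np.eye(4, dtype=complex)] + alpha
metric_signs = [1, -1, -1, -1]   # lowering the index flips the spatial components

# The matrix appearing in (7.169)
T_matrix = 1j * alpha[0] @ alpha[2]

# Complex conjugation comes from K; the Sigma_2 factors come from (7.169)
transformed = [Sigma2 @ m.conj() @ Sigma2 for m in g0_gmu]

print(np.allclose(T_matrix, Sigma2))
print([np.allclose(transformed[mu], metric_signs[mu] * g0_gmu[mu]) for mu in range(4)])
```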

We may now proceed to look at some simple processes in scalar and spinor 
electrodynamics, in the following two chapters. 


Problems 


7.1 Verify that the Lagrangian $\hat{\mathcal{L}}$ of (7.1) is invariant (i.e. $\hat{\mathcal{L}}(\hat{\phi}_1,\hat{\phi}_2) = \hat{\mathcal{L}}(\hat{\phi}_1',\hat{\phi}_2')$) under the transformation (7.2) of the fields $(\hat{\phi}_1,\hat{\phi}_2) \to (\hat{\phi}_1',\hat{\phi}_2')$.
7.2
(a) Verify that, for $\hat{N}^\mu_\phi$ given by (7.23), the corresponding $\hat{N}_\phi$ of (7.14) reduces to the form (7.24); and that, with $\hat{H}$ given by (7.21),

$$[\hat{N}_\phi, \hat{H}] = 0.$$

(b) Verify equation (7.27).
7.3 Show that

$$[\hat{\phi}(x_1), \hat{\phi}^\dagger(x_2)] = 0 \quad \text{for} \quad (x_1 - x_2)^2 < 0$$

[Hint: insert expression (7.16) for the $\hat{\phi}$'s and use the commutation relations (7.18) to express the commutator as the difference of two integrals; in the second integral, $x_1 - x_2$ can be transformed to $-(x_1 - x_2)$ by a Lorentz transformation, since the time-ordering of space-like separated events is frame-dependent].


7.4 Verify that varying $\psi^\dagger$ in the action principle with Lagrangian (7.34) gives the Dirac equation.


7.5 Verify (7.44). 
7.6 Verify equations (7.52) and (7.53). 
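7.7 Verify (7.62).
7.8 Verify the expression given in (7.64) for $\sum_s u(k,s)\bar{u}(k,s)$. [Hint: first, note that $u$ is a four-component Dirac spinor arranged as a column, while $\bar{u}$ is another four-component spinor but this time arranged as a row because of the transpose in the $\dagger$ symbol. So '$u\bar{u}$' has the form

$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} (\bar{u}_1 \;\; \bar{u}_2 \;\; \bar{u}_3 \;\; \bar{u}_4) = \begin{pmatrix} u_1\bar{u}_1 & u_1\bar{u}_2 & \cdots \\ u_2\bar{u}_1 & u_2\bar{u}_2 & \cdots \\ \vdots & \vdots & \end{pmatrix}$$

Verify that

$$\phi^1\phi^{1\dagger} + \phi^2\phi^{2\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.]$$

Similarly, verify the expression for $\sum_s v(k,s)\bar{v}(k,s)$.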


7.9 Verify the result quoted in (7.63) for the Feynman propagator for the 
Dirac field. 


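7.10 Verify that if $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\mu_{\rm em}A_\mu$, where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the Euler-Lagrange equations for $A_\mu$ yield the Maxwell form

$$\Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = j^\mu_{\rm em}.$$

[Hint: it is helpful to use antisymmetry of $F_{\mu\nu}$ to rewrite the '$F \cdot F$' term as $-\frac{1}{2}F_{\mu\nu}\partial^\mu A^\nu$.]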


7.11
(a) Show that the Fourier transform of the free-field equation for $A_\mu$ (i.e. the one in the previous question with $j^\mu_{\rm em}$ set to zero) is given by (7.90).


(b) Verify (7.94). 


7.12 Show that the equation of motion for $A_\mu$, following from the Lagrangian $\mathcal{L}_L$ of (7.97) is

$$\Box A^\mu = 0.$$


7.13 Verify equation (7.118). 
7.14 Verify equations (7.120), (7.121) and (7.122). 


7.15 Verify the form (7.142) of the interaction Hamiltonian, $\hat{\mathcal{H}}'_S$, in charged spin-0 electrodynamics.
7.16 Verify equation (7.143).
7.17

(a) Check that the anticommutation relations (7.44) and (7.45) are left invariant under (7.151).

(b) Check that the Dirac electromagnetic current density $\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$ is odd under C when normally ordered. [Hint: the normally ordered current can be written as $\frac{1}{2}[\bar{\hat{\psi}}_\alpha(x), (\gamma^\mu\hat{\psi}(x))_\alpha]$.]


8


Elementary Processes in Scalar and Spinor 
Electrodynamics 


8.1 Coulomb scattering of charged spin-0 particles 


We begin our study of electromagnetic interactions by considering the simplest case, that of the scattering of a (hypothetical) positively charged spin-0 particle '$s^+$' by a fixed Coulomb potential, treated as a classical field. This will lead us to the relativistic generalization of the Rutherford formula for the cross section. We shall use this example as an exercise to gain familiarity with the quantum field-theoretic approach of chapter 6, since it can also be done straightforwardly using the 'wavefunction' approach familiar from non-relativistic quantum mechanics, when supplemented by the work of chapter 3. We shall also look at '$s^-$' Coulomb scattering, to test the antiparticle prescriptions of chapter 3. Incidentally, we call these scalar particles $s^\pm$ to emphasize that they are not to be identified with, for instance, the physical pions $\pi^\pm$, since the latter are composite ($q\bar{q}$) systems, and hence their interactions are more complicated than those of our hypothetical 'point-like' $s^\pm$ (as we shall see in section 8.4). No point-like charged scalar particles have been discovered, as yet.


8.1.1 Coulomb scattering of $s^+$ (wavefunction approach)


Consider the scattering of a spin-0 particle of charge $e$ and mass $M$, the '$s^+$', in an electromagnetic field described by the classical potential $A^\mu$. The process we are considering is

$$s^+(p) \to s^+(p') \tag{8.1}$$


as shown in figure 8.1, where p and p’ are the initial and final 4-momenta 
respectively. The appropriate potential for use in the KG equation has been 
given in section 3.5: 


$$\hat{V}_{\rm KG} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu) - e^2A^2. \tag{8.2}$$

As we shall see in more detail as we go along, the parameter characterizing each order of perturbation theory based on this potential is found to be $e^2/4\pi$.

FIGURE 8.1
Coulomb scattering of $s^+$.

In natural units (see appendices B and C) this parameter has the value

$$\alpha = e^2/4\pi \approx \frac{1}{137} \tag{8.3}$$

for the elementary charge $e$. $\alpha$ is called the fine structure constant. The smallness of $\alpha$ is the reason why a perturbation approach has been very successful for QED.
To lowest order in $\alpha$ we can neglect the $e^2A^2$ term and the perturbing potential is then

$$\hat{V} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu). \tag{8.4}$$


For a scattering process we shall assume$^1$ the same formula for the transition amplitude as in non-relativistic quantum mechanics (NRQM) time-dependent perturbation theory (see appendix A, equations (A.23) and (A.24)):

$$A_{s^+} = -i\int d^4x\, \phi'^{*}\hat{V}\phi \tag{8.5}$$

where $\phi$ and $\phi'$ are the initial and final state free-particle solutions. The latter are (recall equation (3.11))

$$\phi = Ne^{-ip\cdot x} \tag{8.6}$$

$$\phi' = N'e^{-ip'\cdot x} \tag{8.7}$$


and we shall fix the normalization factors later. Inserting the expression for 
V into (8.5), and doing some integration by parts (problem 8.1), we obtain 


$$A_{s^+} = -i\int d^4x\, \{ ie[\phi'^{*}(\partial_\mu\phi) - (\partial_\mu\phi'^{*})\phi] \} A^\mu. \tag{8.8}$$

The expression inside the braces is very reminiscent of the probability current expression (3.20). Indeed we can write (8.8) as

$$A_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) \tag{8.9}$$

where

$$j^\mu_{{\rm em},s^+}(x) = ie(\phi'^{*}\partial^\mu\phi - (\partial^\mu\phi'^{*})\phi) \tag{8.10}$$

can be regarded as an electromagnetic 'transition current', analogous to the simple probability current for a single state. In the following section we shall see the exact meaning of this idea, using quantum field theory. Meanwhile, we insert the plane-wave free-particle solutions (8.6) and (8.7) for $\phi$ and $\phi'$ into (8.10) to obtain

$$j^\mu_{{\rm em},s^+}(x) = NN'e(p + p')^\mu e^{-i(p - p')\cdot x} \tag{8.11}$$

so that (8.9) becomes

$$A_{s^+} = -iNN'\int d^4x\, e(p + p')_\mu e^{-i(p - p')\cdot x}A^\mu(x). \tag{8.12}$$

$^1$ Justification may be found in chapter 9 of Bjorken and Drell (1964).


In the case of Coulomb scattering from a static point charge $Ze$ ($e > 0$), the vector potential $A^\mu$ is given by

$$A^0 = \frac{Ze}{4\pi|\mathbf{x}|}, \qquad \mathbf{A} = 0. \tag{8.13}$$

Inserting (8.13) into (8.12) we obtain

$$A_{s^+} = -iNN'Ze^2(E + E')\int e^{-i(E - E')t}\,dt \int \frac{e^{i(\mathbf{p} - \mathbf{p}')\cdot\mathbf{x}}}{4\pi|\mathbf{x}|}\,d^3x. \tag{8.14}$$

The initial and final 4-momenta are

$$p = (E, \mathbf{p}), \qquad p' = (E', \mathbf{p}')$$

with $E = \sqrt{M^2 + \mathbf{p}^2}$, $E' = \sqrt{M^2 + \mathbf{p}'^2}$. The first (time) integral in (8.14) gives an energy-conserving $\delta$-function $2\pi\delta(E - E')$ (see appendix E), as is expected for a static (non-recoiling) scattering centre. The second (spatial) integral is the Fourier transform of $1/4\pi|\mathbf{x}|$, which can be obtained from (1.13), (1.26) and (1.27) by setting $m_U = 0$; the result is $1/\mathbf{q}^2$ where $\mathbf{q} = \mathbf{p} - \mathbf{p}'$. Hence


$$A_{s^+} = -iNN'\,2\pi\delta(E - E')\,\frac{Ze^2}{\mathbf{q}^2}\,2E \tag{8.15}$$

$$\equiv -i(2\pi)\delta(E - E')V_{s^+} \quad \text{(cf equation (A.25))} \tag{8.16}$$

where in (8.15) we have used $E = E'$ in the matrix element. This is in the standard form met in time-dependent perturbation theory (cf equations (A.25) and (A.26)).
The transition probability per unit time is then (appendix H, equation (H.18))

$$\dot{P}_{s^+} = 2\pi|V_{s^+}|^2\rho(E') \tag{8.17}$$

where $\rho(E')$ is the density of final states per energy interval $dE'$. This will depend on the normalization adopted for $\phi, \phi'$ via the factors $N, N'$. We choose these to be unity, which means that we are adopting the 'covariant' normalization of $2E$ particles per unit volume. Then (cf equation (H.22))


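$$\rho(E')\,dE' = \frac{|\mathbf{p}'|^2\,d|\mathbf{p}'|}{(2\pi)^3}\,\frac{d\Omega}{2E'}. \tag{8.18}$$

Using $E' = (M^2 + \mathbf{p}'^2)^{1/2}$ one easily finds

$$\rho(E') = \frac{|\mathbf{p}'|\,d\Omega}{16\pi^3}. \tag{8.19}$$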

Note that this differs from equation (H.22) since here we are using relativistic 
kinematics. 

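To obtain the cross section, we need to divide $\dot{P}_{s^+}$ by the incident flux, which is $2|\mathbf{p}|$ in our normalization. Hence

$$d\sigma = (4Z^2e^4E^2/16\pi^2\mathbf{q}^4)\,d\Omega. \tag{8.20}$$

Finally, since $\mathbf{q}^2 = (\mathbf{p} - \mathbf{p}')^2 = 4|\mathbf{p}|^2\sin^2\theta/2$ (cf section 1.3.4) where $\theta$ is the angle between $\mathbf{p}$ and $\mathbf{p}'$, we obtain

$$\frac{d\sigma}{d\Omega} = (Z\alpha)^2\,\frac{E^2}{4|\mathbf{p}|^4\sin^4\theta/2}. \tag{8.21}$$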

This is the Rutherford formula with relativistic kinematics, showing the characteristic $\sin^{-4}\theta/2$ angular dependence (cf figure 1.8). This deservedly famous formula will serve as a 'reference point' for all the subsequent calculations in this chapter, as we proceed to add in various complications, such as spin, recoil and structure. The non-relativistic form may be retrieved by replacing $E$ by $M$.
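As a check on the bookkeeping, the chain from the matrix element (8.15)-(8.16) through the density of states (8.19) and the flux factor to the closed form (8.21) can be followed numerically. The sketch below works in natural units and takes $\alpha = 1/137$ as in (8.3); the function names and the sample kinematics in the usage lines are ours, chosen only for illustration.

```python
import math

ALPHA = 1 / 137.0  # approximate value quoted in (8.3)

def rutherford_via_steps(Z, M, p, theta):
    """Follow (8.15)-(8.20): matrix element, density of states, incident flux."""
    e_squared = 4 * math.pi * ALPHA                # from alpha = e^2 / 4 pi
    E = math.sqrt(M**2 + p**2)
    q_squared = 4 * p**2 * math.sin(theta / 2) ** 2
    V = Z * e_squared * 2 * E / q_squared          # V_{s+} of (8.16), with N = N' = 1
    rho_per_solid_angle = p / (16 * math.pi**3)    # (8.19), per unit solid angle
    flux = 2 * p                                   # covariant normalization
    return 2 * math.pi * V**2 * rho_per_solid_angle / flux

def rutherford_closed_form(Z, M, p, theta):
    """The closed form (8.21) for d(sigma)/d(Omega)."""
    E = math.sqrt(M**2 + p**2)
    return (Z * ALPHA) ** 2 * E**2 / (4 * p**4 * math.sin(theta / 2) ** 4)

# Illustrative numbers only (natural units)
stepwise = rutherford_via_steps(Z=79, M=0.14, p=0.05, theta=1.0)
closed = rutherford_closed_form(Z=79, M=0.14, p=0.05, theta=1.0)
print(math.isclose(stepwise, closed, rel_tol=1e-12))
```

Replacing `E` by `M` in `rutherford_closed_form` would give the non-relativistic form mentioned above.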


8.1.2 Coulomb scattering of $s^+$ (field-theoretic approach)


We follow steps closely similar to those in section 6.3.1, making use of the result quoted in section 7.4, that the appropriate interaction Hamiltonian for use in the Dyson series (6.42) is $\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int}$ where $\hat{\mathcal{L}}_{\rm int}$ is given by (7.139), with $q = e$. As in the step from (8.2) to (8.4) we discard the $e^2$ term to first order and use

$$\hat{\mathcal{H}}'_S(x) = ie(\hat{\phi}^\dagger(x)\partial^\mu\hat{\phi}(x) - (\partial^\mu\hat{\phi}^\dagger(x))\hat{\phi}(x))A_\mu(x). \tag{8.22}$$

Equation (8.22) can be written as $\hat{j}^\mu_{{\rm em},s}A_\mu$ where

$$\hat{j}^\mu_{{\rm em},s} = ie(\hat{\phi}^\dagger\partial^\mu\hat{\phi} - (\partial^\mu\hat{\phi}^\dagger)\hat{\phi}). \tag{8.23}$$

Note that the field $A_\mu$ is not quantized: it is being treated as an 'external' classical potential. The expansion for the field $\hat{\phi}$ is given in (7.16). As in (6.48), the lowest-order amplitude is


$$A_{s^+} = -i\langle s^+, p'|\int d^4x\, \hat{\mathcal{H}}'_S(x)|s^+, p\rangle \tag{8.24}$$

where (cf (6.49))

$$|s^+, p\rangle = \sqrt{2E}\,\hat{a}^\dagger(p)|0\rangle. \tag{8.25}$$


We are, of course, anticipating in our notation that (8.24) will indeed be the 
same as (8.12). The required amplitude is then 


$$A_{s^+} = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.26}$$


Using the expansion (7.16), the definition (8.25) and the vacuum conditions 
(7.30), and following the method of section 6.3.1, it is a good exercise to check 
that the value of the matrix element in (8.26) is (problem 8.2) 


$$\langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle = e(p + p')^\mu e^{-i(p - p')\cdot x}. \tag{8.27}$$


This is exactly the same as the expression we obtained in (8.11) for the wave 
mechanical transition current in this case, using the normalization N = N’ = 
1, which is consistent with the field-theoretic normalization in (8.25). Thus 
our wave mechanical transition current is indeed the matrix element of the 
field-theoretical electromagnetic current operator: 


$$j^\mu_{{\rm em},s^+}(x) = \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle. \tag{8.28}$$

Combining all these results, we have therefore connected the 'wavefunction' amplitude and the 'field-theory' amplitude via

$$A_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.29}$$

We note that because of the static nature of the potential, and the non-covariant choice of $A^\mu$ (only $A^0 \neq 0$), our answer in either case cannot be expected to yield a Lorentz invariant amplitude.


8.1.3 Coulomb scattering of $s^-$
The physical process is (figure 8.2(a))

$$s^-(p) \to s^-(p') \tag{8.30}$$

where, of course, $E$ and $E'$ are both positive ($E = (M^2 + \mathbf{p}^2)^{1/2}$ and similarly for $E'$). Since the charge on the antiparticle $s^-$ is $-e$, the amplitude for this process can, in fact, be immediately obtained from (8.12) by merely changing the sign of $e$. Because of the way $e$ and the 4-momenta $p$ and $p'$ enter (8.12), however, this in turn is the same as letting $p \to -p'$ and $p' \to -p$: this changes the sign of the '$e(p + p')_\mu$' part as required, and leaves the exponential unchanged. Hence we see in action here (admittedly in a very simple example) the Feynman interpretation of the negative 4-momentum solutions, described in section 3.4.4: the amplitude for $s^-(p) \to s^-(p')$ is the same as the amplitude for $s^+(-p') \to s^+(-p)$. The latter process is shown in figure 8.2(b).

FIGURE 8.2
Coulomb scattering of $s^-$: (a) the physical process with antiparticles of positive 4-momentum, and (b) the related unphysical process with particles of negative 4-momentum, using the Feynman prescription.
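The substitution argument can be made concrete with a few lines of arithmetic. The sketch below isolates the two places where the charge and the momenta enter (8.12), the prefactor $e(p + p')$ and the 4-vector $p - p'$ in the exponent, and compares 'change the sign of $e$' with 'let $p \to -p'$, $p' \to -p$'. The helper names and the numerical values are ours and purely illustrative.

```python
import numpy as np

def on_shell(mass, three_momentum):
    """Build the 4-momentum (E, p) with E = sqrt(M^2 + p^2)."""
    p = np.asarray(three_momentum, dtype=float)
    return np.concatenate([[np.sqrt(mass**2 + p @ p)], p])

def kinematic_factors(charge, p_initial, p_final):
    """Return the prefactor charge*(p + p') and the exponent vector (p - p') of (8.12)."""
    return charge * (p_initial + p_final), p_initial - p_final

e, M = 0.3, 1.0                          # illustrative numbers only
p = on_shell(M, [0.0, 0.0, 2.0])
p_prime = on_shell(M, [1.2, 0.0, 1.6])   # same |p|, as required for elastic scattering

antiparticle = kinematic_factors(-e, p, p_prime)   # s^-(p) -> s^-(p'): sign of e changed
feynman = kinematic_factors(+e, -p_prime, -p)      # s^+(-p') -> s^+(-p)

print(np.allclose(antiparticle[0], feynman[0]))    # same prefactor
print(np.allclose(antiparticle[1], feynman[1]))    # same exponent
```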


The same conclusion can be derived from the field-theory formalism. In this case we need to evaluate the matrix element

$$\langle s^-, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^-, p\rangle, \tag{8.31}$$

where the same $\hat{j}^\mu_{{\rm em},s}$ of equation (8.23) enters: $\hat{\phi}$ of (7.16) contains the antiparticle operator too! It is again a good exercise to check, using

$$|s^-, p\rangle = \sqrt{2E}\,\hat{b}^\dagger(p)|0\rangle \tag{8.32}$$

and remembering to normally order the operators in $\hat{j}^\mu_{{\rm em},s}$, that (8.31) is given by the expected result, namely, (8.27) with $e \to -e$ (problem 8.3).

Since the matrix elements only differ by a sign, the cross sections for $s^+$ and $s^-$ Coulomb scattering will be the same to this (lowest) order in $\alpha$.

FIGURE 8.3
Coulomb scattering of $e^-$.


8.2 Coulomb scattering of charged spin-$\frac{1}{2}$ particles


8.2.1 Coulomb scattering of $e^-$ (wavefunction approach)


We shall call the particle an electron, of charge $-e$ ($e > 0$) and mass $m$; note that by convention it is the negatively charged fermion that is the 'particle', but the positively charged boson. The process we are considering is (figure 8.3)

$$e^-(k, s) \to e^-(k', s') \tag{8.33}$$

where $k, s$ are the 4-momentum and spin of the incident $e^-$, and similarly for $k', s'$, with $k = (E, \mathbf{k})$ and $E = (m^2 + \mathbf{k}^2)^{1/2}$ and similarly for $k'$.

The appropriate potential to use in the Dirac equation has been given in section 3.5:

$$\hat{V}_D = -eA^0\mathbf{1} + e\boldsymbol{\alpha}\cdot\mathbf{A} = e\begin{pmatrix} -A^0 & \boldsymbol{\sigma}\cdot\mathbf{A} \\ \boldsymbol{\sigma}\cdot\mathbf{A} & -A^0 \end{pmatrix} \tag{8.34}$$

for a particle of charge $-e$. This potential is a $4 \times 4$ matrix and to obtain an amplitude in the form of a single complex number, we must use $\psi'^\dagger$ instead of $\psi'^{*}$ in the matrix element. The first-order amplitude (figure 8.3) is therefore

$$A_{e^-} = -i\int d^4x\, \psi'^\dagger(k', s')\hat{V}_D\psi(k, s) \tag{8.35}$$

where $s$ and $s'$ label the spin components. The spin labels are necessary since the spin configuration may be changed by the interaction. In (8.35), $\psi$ and $\psi'$ are free-particle positive-energy solutions of the Dirac equation, as in (3.74), with $u$ given by equation (3.73) and normalized to $u^\dagger u = 2E$, $E = (m^2 + \mathbf{k}^2)^{1/2}$.

The Lorentz properties of (8.35) become much clearer if we use the $\gamma$-matrix notation of problem 4.3. For convenience we re-state the definitions here:

$$\gamma^0 = \beta \qquad (\gamma^0)^2 = 1 \tag{8.36}$$

$$\gamma^i = \beta\alpha_i \qquad (\gamma^i)^2 = -1 \qquad i = 1, 2, 3. \tag{8.37}$$

The Dirac equation may then be written (problem 4.3) as

$$(i\slashed{\partial} - m)\psi = 0 \tag{8.38}$$

where the 'slash' notation introduced in (7.59) has been used ($i\slashed{\partial} = i\gamma^\mu\partial_\mu$). Defining $\bar{\psi} = \psi^\dagger\gamma^0$, (8.35) becomes

$$A_{e^-} = -i\int d^4x\, (-e\bar{\psi}'(x)\gamma^\mu\psi(x))A_\mu(x) \tag{8.39}$$

$$\equiv -i\int d^4x\, j^\mu_{{\rm em},e^-}(x)A_\mu(x) \tag{8.40}$$

where we have defined an electromagnetic transition current for a negatively charged fermion:

$$j^\mu_{{\rm em},e^-}(x) = -e\bar{\psi}'(x)\gamma^\mu\psi(x), \tag{8.41}$$

exactly analogous to the one for a positively charged boson introduced in section 8.1.1. We know from section 4.1.2 that $\bar{\psi}'\gamma^\mu\psi$ is a 4-vector, showing that $A_{e^-}$ of (8.40) is Lorentz invariant.

Inserting free-particle solutions for $\psi$ and $\psi'^\dagger$ in (8.41), we obtain

$$j^\mu_{{\rm em},e^-}(x) = -e\bar{u}(k', s')\gamma^\mu u(k, s)e^{-i(k - k')\cdot x} \tag{8.42}$$

so that (8.39) becomes

$$A_{e^-} = -i\int d^4x\, (-e\bar{u}'\gamma^\mu u\, e^{-i(k - k')\cdot x})A_\mu(x) \tag{8.43}$$

where $u = u(k, s)$ and similarly for $u'$. Note that the $u$'s do not depend on $x$. For the case of the Coulomb potential in equation (8.13), $A_{e^-}$ becomes

$$A_{e^-} = i2\pi\delta(E - E')\,\frac{Ze^2}{\mathbf{q}^2}\,u'^\dagger u \tag{8.44}$$

just as in (8.15), where $\mathbf{q} = \mathbf{k} - \mathbf{k}'$ and we have used $\bar{u}'\gamma^0 = u'^\dagger$. Comparing (8.44) with (8.15), we see that (using the covariant normalization $N = N' = 1$) the amplitude in the spinor case is obtained from that for the scalar case by the replacement '$2E \to u'^\dagger u$' and the sign of the amplitude is reversed as expected for $e^-$ rather than $s^+$ scattering.

We now have to understand how to define the cross section for particles with spin and then how to calculate it. Clearly the cross section is proportional to $|A_{e^-}|^2$, which involves $|u^\dagger(k', s')u(k, s)|^2$ here. Usually the incident beam is unpolarized, which means that it is a random mixture of both spin states $s$ ('up' or 'down'). It is important to note that this is an incoherent average, in the sense that we average the cross section rather than the amplitude. Furthermore, most experiments measure only the direction and energy of the scattered electron and are not sensitive to the spin state $s'$. Thus what we wish to calculate, in this case, is the unpolarized cross section defined by

$$d\bar{\sigma} = \tfrac{1}{2}(d\sigma_{\uparrow\uparrow} + d\sigma_{\uparrow\downarrow} + d\sigma_{\downarrow\uparrow} + d\sigma_{\downarrow\downarrow}) = \tfrac{1}{2}\sum_{s',s} d\sigma_{s',s} \tag{8.45}$$

where $d\sigma_{s',s} \propto |u^\dagger(k', s')u(k, s)|^2$. In (8.45), we are averaging over the two possible initial spin polarizations and summing over the final spin states arising from each initial spin state.

It is possible to calculate the quantity

$$S = \tfrac{1}{2}\sum_{s',s} |u'^\dagger u|^2 \tag{8.46}$$

by brute force, using (3.73) and taking the two-component spinors to be, say,

$$\phi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.47}$$

One finds (problem 8.4)

$$S = (2E)^2(1 - v^2\sin^2\theta/2) \tag{8.48}$$

where $v = |\mathbf{k}|/E$ is the particle's speed and $\theta$ is the scattering angle. If we now recall that (i) the matrix element (8.44) can be obtained from (8.15) by the replacement '$2E \to u'^\dagger u$' and (ii) the normalization of our spinor states is the same ('$\rho = 2E$') as in the scalar case, so that the flux and density of states factors are unchanged, we may infer from (8.21) that

$$\frac{d\bar{\sigma}}{d\Omega} = (Z\alpha)^2\,\frac{E^2}{4|\mathbf{k}|^4\sin^4\theta/2}\,(1 - v^2\sin^2\theta/2). \tag{8.49}$$

This is the Mott cross section (Mott 1929). Comparing this with the basic Rutherford formula (8.21), we see that the factor $(1 - v^2\sin^2\theta/2)$ (which comes from the spin summation) represents the effect of replacing spin-0 scattering particles by spin-$\frac{1}{2}$ ones.
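The 'brute force' evaluation of $S$ is easy to carry out numerically. The sketch below assumes that the spinor referred to as (3.73) is the usual standard-representation positive-energy solution, $u = (E+m)^{1/2}\,(\phi,\ \boldsymbol{\sigma}\cdot\mathbf{k}\,\phi/(E+m))$, which has $u^\dagger u = 2E$; the incident momentum is taken along the $z$-axis and the scattering plane is the $x$-$z$ plane. Function names are ours.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def u_spinor(k_vec, m, phi):
    """Positive-energy spinor (assumed form of (3.73)), normalized to u^dagger u = 2E."""
    E = np.sqrt(m**2 + k_vec @ k_vec)
    sigma_dot_k = sum(k_i * s_i for k_i, s_i in zip(k_vec, PAULI))
    return np.sqrt(E + m) * np.concatenate([phi, sigma_dot_k @ phi / (E + m)])

def spin_sum_S(k_mag, m, theta):
    """S of (8.46): half the sum over s, s' of |u'^dagger u|^2."""
    k_in = np.array([0.0, 0.0, k_mag])
    k_out = k_mag * np.array([np.sin(theta), 0.0, np.cos(theta)])
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]  # (8.47)
    total = 0.0
    for phi_in in basis:
        for phi_out in basis:
            amp = np.vdot(u_spinor(k_out, m, phi_out), u_spinor(k_in, m, phi_in))
            total += abs(amp) ** 2
    return 0.5 * total

def closed_form_S(k_mag, m, theta):
    """Right-hand side of (8.48)."""
    E = np.sqrt(m**2 + k_mag**2)
    v = k_mag / E
    return (2 * E) ** 2 * (1 - v**2 * np.sin(theta / 2) ** 2)

print(np.isclose(spin_sum_S(1.3, 0.511, 0.9), closed_form_S(1.3, 0.511, 0.9)))
```

Dividing `spin_sum_S` by $(2E)^2$ gives the factor that multiplies the Rutherford result in (8.49).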

Indeed, this factor has an important physical interpretation. Consider the extreme relativistic limit ($v \to 1$, $m \to 0$), when the factor becomes $\cos^2\theta/2$, which vanishes in the backward direction $\theta = \pi$. This may be understood as follows. In the $m \to 0$ limit, it is appropriate to use the representation (3.40) of the Dirac matrices and, in this case equations (4.14) and (4.15) show that the Dirac spinor takes the form

$$u = \begin{pmatrix} u_R \\ u_L \end{pmatrix} \tag{8.50}$$

where $u_R$ and $u_L$ have positive and negative helicity respectively. The spinor part of the matrix element (8.44) then becomes $u_R'^\dagger u_R + u_L'^\dagger u_L$, from which it is clear that helicity is conserved: the helicity of the $u'$ spinors equals that of the $u$ spinors; in particular there are no helicity mixing terms of the form $u_R'^\dagger u_L$ or $u_L'^\dagger u_R$. Consider then an initial state electron with positive helicity, and take the $z$-axis to be along the incident momentum. The $z$-component of angular momentum is then $+\frac{1}{2}$. Suppose the electron is scattered through an angle of $\pi$. Since helicity is conserved, the scattered electron's helicity will still be positive, but since the direction of its momentum has been reversed, its angular momentum along the original axis will be $-\frac{1}{2}$. Hence this configuration is forbidden by angular momentum conservation, and similarly for an incoming negative helicity state. The spin labels $s', s$ in (8.46) can be taken to be helicity labels and so it follows that the quantity $S$ must vanish for $\theta = \pi$ in the $m \to 0$ limit. The 'R' and 'L' states are mixed by a mass term in the Dirac equation (see (4.14) and (4.15)) and hence we expect backward scattering to be increasingly allowed as $m/E$ increases (recall that $v = (1 - m^2/E^2)^{1/2}$ so that $1 - v^2\sin^2\theta/2 = \cos^2\theta/2 + (m^2/E^2)\sin^2\theta/2$).
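The closing identity, and the two limits discussed above, can be confirmed with a short numerical sketch. The function names are ours; the quantities are those in the text ($m$, $E$, and the scattering angle $\theta$).

```python
import math

def spin_factor(m, E, theta):
    """The factor 1 - v^2 sin^2(theta/2), with v^2 = 1 - m^2/E^2."""
    v_squared = 1 - m**2 / E**2
    return 1 - v_squared * math.sin(theta / 2) ** 2

def helicity_decomposition(m, E, theta):
    """The same factor written as cos^2(theta/2) + (m^2/E^2) sin^2(theta/2)."""
    return math.cos(theta / 2) ** 2 + (m**2 / E**2) * math.sin(theta / 2) ** 2

# The two forms agree at a generic angle
print(math.isclose(spin_factor(0.511, 2.0, 2.0), helicity_decomposition(0.511, 2.0, 2.0)))

# Backward scattering: forbidden for m = 0, allowed in proportion to m^2/E^2 otherwise
print(spin_factor(0.0, 2.0, math.pi))
print(spin_factor(0.511, 2.0, math.pi), (0.511 / 2.0) ** 2)
```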


8.2.2 Coulomb scattering of $e^-$ (field-theoretic approach)


Once again, the interaction Hamiltonian has been given in section 7.4, namely

$$\hat{\mathcal{H}}'_D = -e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}A_\mu = \hat{j}^\mu_{{\rm em},e}A_\mu \tag{8.51}$$

where the current operator $\hat{j}^\mu_{{\rm em},e}$ is just $-e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}$ in this case. The lowest-order amplitude is then

$$A_{e^-} = -i\langle e^-, k', s'|\int d^4x\, \hat{\mathcal{H}}'_D(x)|e^-, k, s\rangle \tag{8.52}$$

$$= -i\int d^4x\, \langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle A_\mu(x). \tag{8.53}$$

With our normalization, and referring to the fermionic expansion (7.35), the states are defined by

$$|e^-, k, s\rangle = \sqrt{2E}\,\hat{c}_s^\dagger(k)|0\rangle \tag{8.54}$$

and similarly for the final state. We then find (problem 8.5) that the current matrix element in (8.53) takes the form

$$\langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle = -e\bar{u}'\gamma^\mu u\, e^{-i(k - k')\cdot x} = j^\mu_{{\rm em},e^-}(x) \tag{8.55}$$

exactly as in (8.42). Thus once again, the 'wavefunction' and 'field-theoretic' approaches have been shown to be equivalent, in a simple case.


8.2.3 Trace techniques for spin summations 


The calculation of cross sections involving fermions rapidly becomes laborious following the 'brute force' method of section 8.2.1, in which the explicit forms for $u$ and $u'^\dagger$ were used. Fortunately we can avoid this by using a powerful labour-saving device due to Feynman, in which the $\gamma$'s come into their own.
We need to calculate the quantity $S$ given in (8.46). This will turn out to be just the first in a series of such objects. With later needs in mind, we shall here calculate a more general quantity than (8.46), namely the lepton tensor

$$L^{\mu\nu}(k', k) = \tfrac{1}{2}\sum_{s',s} \bar{u}(k', s')\gamma^\mu u(k, s)\,[\bar{u}(k', s')\gamma^\nu u(k, s)]^{*} \tag{8.56}$$

$$= \frac{1}{2e^2}\sum_{s',s} \langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(0)|e^-, k, s\rangle \langle e^-, k', s'|\hat{j}^\nu_{{\rm em},e}(0)|e^-, k, s\rangle^{*}. \tag{8.57}$$

Clearly this will be relevant to the more general case in which $A^\mu$ contains non-zero spatial components, for example. For our present application, we shall need only $L^{00}$.

We first note that $L^{\mu\nu}$ is correctly called a tensor (a contravariant second-rank one, in fact; see appendix D), because the two factors '$\bar{u}'\gamma^\mu u$' and '$\bar{u}'\gamma^\nu u$' are each 4-vectors, as we have seen. (We might worry a little over the complex conjugation of the second factor, but this will disappear after the next step.) Consider therefore the factor $[\bar{u}(k', s')\gamma^\nu u(k, s)]^{*}$. For each value of the index $\nu$, this is just a number (the corresponding component of the 4-vector), and so it can make no difference if we take its transpose, in a matrix sense (the transpose of a $1 \times 1$ matrix is certainly equal to itself!). In that case the complex conjugate becomes the Hermitian conjugate, which is:

$$[\bar{u}(k', s')\gamma^\nu u(k, s)]^\dagger = u^\dagger(k, s)\gamma^{\nu\dagger}\gamma^{0\dagger}u(k', s') \tag{8.58}$$

$$= \bar{u}(k, s)\gamma^\nu u(k', s') \tag{8.59}$$

since (problem 8.6)

$$\gamma^0\gamma^{\nu\dagger}\gamma^0 = \gamma^\nu \tag{8.60}$$

and $\gamma^{0\dagger} = \gamma^0$. Thus $L^{\mu\nu}$ may be written in the more streamlined form


$$L^{\mu\nu} = \tfrac{1}{2}\sum_{s',s} \bar{u}(k', s')\gamma^\mu u(k, s)\bar{u}(k, s)\gamma^\nu u(k', s') \tag{8.61}$$

which is, moreover, evidently the (tensor) product of two 4-vectors. However, there is more to this than saving a few symbols. We have seen the expression

$$\sum_s u(k, s)\bar{u}(k, s) \tag{8.62}$$

before! (See (7.64) and problem 7.8.) Thus we can replace the sum (8.62) over spin states '$s$' by the corresponding matrix $(\slashed{k} + m)$:

$$L^{\mu\nu} = \tfrac{1}{2}\sum_{s'} \bar{u}_\alpha(k', s')(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta}u_\delta(k', s') \tag{8.63}$$

where we have made the matrix indices explicit, and summation on all repeated matrix indices is understood. In particular, note that every matrix index is repeated, so that each one is in fact summed over: there are no 'spare' indices. Now, since we can reorder matrix elements as we wish, we can bring the $u_\delta$ to the front of the expression, and use the same trick to perform the second spin sum:

$$\sum_{s'} u_\delta(k', s')\bar{u}_\alpha(k', s') = (\slashed{k}' + m)_{\delta\alpha}. \tag{8.64}$$

Thus $L^{\mu\nu}$ takes the form of a matrix product, summed over the diagonal elements:

$$L^{\mu\nu} = \tfrac{1}{2}(\slashed{k}' + m)_{\delta\alpha}(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta} \tag{8.65}$$

$$= \tfrac{1}{2}\sum_\delta [(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]_{\delta\delta} \tag{8.66}$$

where we have explicitly reinstated the sum over $\delta$. The right-hand side of (8.66) is the trace (i.e. the sum of the diagonal elements) of the matrix formed by the product of the four indicated matrices:

$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]. \tag{8.67}$$

Such matrix traces have some useful properties which we now list. Denote the trace of a matrix $A$ by

$${\rm Tr}A = \sum_i A_{ii}. \tag{8.68}$$

Consider now the trace of a matrix product,

$${\rm Tr}(AB) = \sum_{i,j} A_{ij}B_{ji} \tag{8.69}$$

where we have written the summations in explicitly. We can (as before) freely exchange the order of the matrix elements $A_{ij}$ and $B_{ji}$, to rewrite (8.69) as

$${\rm Tr}(AB) = \sum_{i,j} B_{ji}A_{ij}. \tag{8.70}$$

But the right-hand side is precisely ${\rm Tr}(BA)$; hence we have shown that

$${\rm Tr}(AB) = {\rm Tr}(BA). \tag{8.71}$$

Similarly it is easy to show that

$${\rm Tr}(ABC) = {\rm Tr}(CAB). \tag{8.72}$$

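We may now return to (8.67). The advantage of the trace form is that we can invoke some powerful results about traces of products of $\gamma$-matrices. Here we shall just list the trace 'theorems' that we shall use to evaluate $L^{\mu\nu}$: more complete statements of trace theorems and $\gamma$-matrix algebra, together with proofs of these theorems, are given in appendix J.

We need the following results:

(i) ${\rm Tr}\,\mathbf{1} = 4$ (8.73)
(ii) ${\rm Tr}$ (odd number of $\gamma$'s) $= 0$ (8.74)
(iii) ${\rm Tr}(\slashed{a}\slashed{b}) = 4(a \cdot b)$ (8.75)
(iv) ${\rm Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a \cdot b)(c \cdot d) + (a \cdot d)(b \cdot c) - (a \cdot c)(b \cdot d)]$. (8.76)

Then

$${\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu] = {\rm Tr}(\slashed{k}'\gamma^\mu\slashed{k}\gamma^\nu) + m{\rm Tr}(\gamma^\mu\slashed{k}\gamma^\nu) + m{\rm Tr}(\slashed{k}'\gamma^\mu\gamma^\nu) + m^2{\rm Tr}(\gamma^\mu\gamma^\nu). \tag{8.77}$$

The terms linear in $m$ are zero by theorem (ii), and using (iii) in the form

$${\rm Tr}(\gamma_\mu\gamma_\nu)a^\mu b^\nu = 4g_{\mu\nu}a^\mu b^\nu = 4a \cdot b \tag{8.78}$$

and (iv) in a similar form, we obtain (problem 8.7)

$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu] = 2[k'^\mu k^\nu + k'^\nu k^\mu - (k' \cdot k)g^{\mu\nu} + m^2g^{\mu\nu}]. \tag{8.79}$$

In the present case we simply want $L^{00}$, which is found to be (problem 8.8)

$$L^{00} = 4E^2(1 - v^2\sin^2\theta/2) \tag{8.80}$$

where $v = |\mathbf{k}|/E$, just as in (8.48).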
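The trace theorems and the result (8.79) lend themselves to a direct numerical check with explicit $4 \times 4$ matrices. The sketch below assumes the standard representation of the $\gamma$-matrices and the metric $g = {\rm diag}(1,-1,-1,-1)$; the helper names (`slash`, `dot`, and so on) and the sample kinematics are ours.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Standard representation (assumed): gamma^0 = beta, gamma^i = beta alpha_i
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski product a.b of two contravariant 4-vectors."""
    return a @ METRIC @ b

def slash(a):
    """a-slash = gamma^mu a_mu for a contravariant 4-vector a."""
    a_lower = METRIC @ a
    return sum(a_lower[mu] * GAMMA[mu] for mu in range(4))

def lepton_tensor_from_trace(k_out, k_in, m):
    """Evaluate (8.67) component by component."""
    left = slash(k_out) + m * np.eye(4)
    right = slash(k_in) + m * np.eye(4)
    return np.array([[0.5 * np.trace(left @ GAMMA[mu] @ right @ GAMMA[nu])
                      for nu in range(4)] for mu in range(4)])

def lepton_tensor_closed_form(k_out, k_in, m):
    """Right-hand side of (8.79)."""
    return 2 * (np.outer(k_out, k_in) + np.outer(k_in, k_out)
                - (dot(k_out, k_in) - m**2) * METRIC)

# Illustrative elastic kinematics (arbitrary numbers)
m, k_mag, theta = 0.511, 1.3, 0.9
E = np.sqrt(m**2 + k_mag**2)
k_in = np.array([E, 0.0, 0.0, k_mag])
k_out = np.array([E, k_mag * np.sin(theta), 0.0, k_mag * np.cos(theta)])

L_trace = lepton_tensor_from_trace(k_out, k_in, m)
L_closed = lepton_tensor_closed_form(k_out, k_in, m)
L00_expected = 4 * E**2 * (1 - (k_mag / E) ** 2 * np.sin(theta / 2) ** 2)   # (8.80)

print(np.allclose(L_trace, L_closed))
print(np.isclose(L_trace[0, 0].real, L00_expected))
```

The same helpers can be used to test theorems (ii)-(iv) on arbitrary 4-vectors, since those identities do not require the vectors to be on-shell.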


8.2.4 Coulomb scattering of $e^+$


The physical process is

$$e^+(k, s) \to e^+(k', s') \tag{8.81}$$

where, as usual, we emphasize that $E$ and $E'$ are both positive. In the wavefunction approach, we saw in section 3.4.4 that, because $\rho \geq 0$ always for a Dirac particle, we had to introduce a minus sign 'by hand', according to the rule stated at the end of section 3.4.4. This rule gives us, in the present case,

$$\text{amplitude}\,(e^+(k, s) \to e^+(k', s')) = -\text{amplitude}\,(e^-(-k', -s') \to e^-(-k, -s)). \tag{8.82}$$

Referring to (8.43), therefore, the required amplitude for the process (8.81) is

$$A_{e^+} = i\int d^4x\, (-e\bar{v}(k, s)\gamma^\mu v(k', s')e^{-i(k - k')\cdot x})A_\mu(x) \tag{8.83}$$

since the '$v$' solutions have been set up precisely to correspond to the '$-k, -s$' situation. In evaluating the cross section from (8.83), the only difference from the $e^-$ case is the appearance of the spinors '$v$' rather than '$u$'; the lepton tensor in this case is

$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k} - m)\gamma^\mu(\slashed{k}' - m)\gamma^\nu] \tag{8.84}$$

using the result (7.64) for $\sum_s v(k, s)\bar{v}(k, s)$. Expression (8.84) differs from (8.67) by the sign of $m$ and by $k \leftrightarrow k'$, but the result (8.79) for the trace is insensitive to these changes. Thus the positron Coulomb scattering cross section is equal to the electron one to lowest order in $\alpha$.
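That the trace is insensitive to $m \to -m$ together with $k \leftrightarrow k'$ can be confirmed numerically by evaluating (8.67) and (8.84) side by side. As before, this is a sketch assuming the standard representation of the $\gamma$-matrices; the function `half_trace` and the sample momenta are ours.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(a):
    """a-slash = gamma^mu a_mu for a contravariant 4-vector a."""
    a_lower = METRIC @ a
    return sum(a_lower[mu] * GAMMA[mu] for mu in range(4))

def half_trace(p_left, m_left, p_right, m_right):
    """(1/2) Tr[(p_left-slash + m_left) gamma^mu (p_right-slash + m_right) gamma^nu]."""
    left = slash(p_left) + m_left * np.eye(4)
    right = slash(p_right) + m_right * np.eye(4)
    return np.array([[0.5 * np.trace(left @ GAMMA[mu] @ right @ GAMMA[nu])
                      for nu in range(4)] for mu in range(4)])

# Illustrative elastic kinematics (arbitrary numbers)
m, k_mag, theta = 0.511, 1.3, 0.9
E = np.sqrt(m**2 + k_mag**2)
k = np.array([E, 0.0, 0.0, k_mag])
k_prime = np.array([E, k_mag * np.sin(theta), 0.0, k_mag * np.cos(theta)])

L_electron = half_trace(k_prime, +m, k, +m)   # (8.67)
L_positron = half_trace(k, -m, k_prime, -m)   # (8.84)

print(np.allclose(L_electron, L_positron))
```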

In the field-theoretic approach, the same interaction Hamiltonian $\hat{\mathcal{H}}'_D$ which we used for $e^-$ scattering will again automatically yield the $e^+$ matrix element (recall the discussion at the end of section 8.1.3). In place of (8.53), the amplitude we wish to calculate is

$$A_{e^+} = -i\int d^4x\, \langle e^+, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^+, k, s\rangle A_\mu(x)$$

$$= -i\int d^4x\, \langle e^+, k', s'|-e\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)|e^+, k, s\rangle A_\mu(x) \tag{8.85}$$

where, referring to the fermionic expansion (7.35),

$$|e^+, k, s\rangle = \sqrt{2E}\,\hat{d}_s^\dagger(k)|0\rangle, \tag{8.86}$$

and similarly for the final state. In evaluating the matrix element in (8.85) we must again remember to normally order the fields, according to the discussion in section 7.2. Bearing this in mind, and inserting the expansion (7.35), one finds (problem 8.9)

$$\langle e^+, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^+, k, s\rangle = +e\bar{v}(k, s)\gamma^\mu v(k', s')e^{-i(k - k')\cdot x} \tag{8.87}$$

$$= j^\mu_{{\rm em},e^+}(x) \tag{8.88}$$

just as required in (8.83). Note especially that the correct sign has emerged naturally without having to be put in 'by hand', as was necessary in the wavefunction approach when applied to an antifermion.

We are now ready to look at some more realistic (and covariant) processes.


8.3 $e^-s^+$ scattering
8.3.1 The amplitude for $e^-s^+ \to e^-s^+$


We consider the two-body scattering process


$$e^-(k,s) + s^+(p) \to e^-(k',s') + s^+(p') \tag{8.89}$$

FIGURE 8.4
$e^-s^+$ scattering amplitude.


where the 4-momenta and spins are as indicated in figure 8.4. How will the
$e^-$ and $s^+$ interact? In this case, there is no ‘external’ classical
electromagnetic potential in the problem. Instead, each of $e^-$ and $s^+$, as
charged particles, acts as a source for the electromagnetic field, with which
they in turn interact. We can picture the process as one in which each particle
scatters off the ‘virtual’ field produced by the other (we shall make this more
precise in comment (2) after equation (8.102)). The formalism of quantum field
theory is perfectly adapted to account for such effects, as we shall see. It is
very significant that no new interaction is needed to describe the process
(8.89) beyond what we already have: the complete Lagrangian is now simply the
free-field Lagrangians for the spin-$\frac{1}{2}$ $e^-$, the spin-0 $s^+$ and
the Maxwell field, together with the sum of the lowest order scalar
electromagnetic interaction Hamiltonian of (8.22), and the Dirac interaction
Hamiltonian of (7.135) with $q = -e$. The full interaction Hamiltonian is then


$$\begin{aligned}
\hat{H}'(x) &= \left[ie\left(\hat\phi^\dagger(x)\partial^\mu\hat\phi(x) - (\partial^\mu\hat\phi^\dagger(x))\hat\phi(x)\right) - e\,\hat{\bar\psi}(x)\gamma^\mu\hat\psi(x)\right]\hat{A}_\mu(x) && (8.90)\\
&= \left(\hat{j}^\mu_{{\rm em},s}(x) + \hat{j}^\mu_{{\rm em},e}(x)\right)\hat{A}_\mu(x) && (8.91)
\end{aligned}$$


where the ‘total current’ in (8.91) is just the indicated sum of the
$\hat\phi$ (scalar) and $\hat\psi$ (spinor) currents. This $\hat{H}'$ must now
be used in the Dyson expansion (6.42), in a perturbative calculation of the
$e^-s^+ \to e^-s^+$ amplitude.

Note now that, in contrast to our Coulomb scattering ‘warm-ups’, the
electromagnetic field is quantized in (8.90). We first observe that, since
there are no free photons in either the initial or final states in our process
$e^-s^+ \to e^-s^+$, the first-order matrix element of $\hat{H}'$ must vanish
(as did the corresponding first-order amplitude in AB → AB scattering, in
section 6.3.2). The first non-vanishing scattering processes arise at second
order (cf (6.74)):


$$\mathcal{A}_{e^-s^+} = \frac{(-i)^2}{2}\int\!\!\int d^4x_1\,d^4x_2\,\langle 0|\hat{c}_{s'}(k')\,\hat{a}(p')\,T\{\hat{H}'(x_1)\hat{H}'(x_2)\}\,\hat{a}^\dagger(p)\,\hat{c}^\dagger_s(k)|0\rangle \times (16E_kE_pE_{k'}E_{p'})^{1/2}. \tag{8.92}$$


Just as for AB → AB and the C field in the ‘ABC’ model (cf (6.81)), as far
as the $\hat{A}_\mu$ operators in (8.92) are concerned the only surviving
contraction is

$$\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \tag{8.93}$$


which is the Feynman propagator for the photon, in coordinate space. As
regards the rest of the matrix element (8.92), since the $\hat{a}$’s and
$\hat{c}$’s commute the ‘$s^+$’ and ‘$e^-$’ parts are quite independent, and
(8.92) reduces to


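$$\frac{(-i)^2}{2}\int\!\!\int d^4x_1\,d^4x_2\,\Big\{\langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(x_1)|s^+,p\rangle\,\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \times \langle e^-,k',s'|\hat{j}^\nu_{{\rm em},e}(x_2)|e^-,k,s\rangle + (x_1 \leftrightarrow x_2)\Big\}. \tag{8.94}$$


But we know the explicit form of the current matrix elements in (8.94), from
(8.27) and (8.55). Inserting these expressions into (8.94), and noting that the
term with $x_1 \leftrightarrow x_2$ is identical to the first term, one finds
(cf (6.102) and problem 8.10)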


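$$\mathcal{A}_{e^-s^+} = i(2\pi)^4\delta^4(p + k - p' - k')\,\mathcal{M}_{e^-s^+} \tag{8.95}$$

where (using the general form (7.122) of the photon propagator)


$$\begin{aligned}
i\mathcal{M}_{e^-s^+} &= (-i)^2\,(e(p+p')^\mu)\left(\frac{i[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2]}{q^2}\right)(-e\,\bar{u}(k',s')\gamma^\nu u(k,s)) && (8.96)\\
&= (-i\,j^\mu_{s^+}(p,p'))\left(\frac{i[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2]}{q^2}\right)(-i\,j^\nu_{e^-}(k,k')) && (8.97)
\end{aligned}$$


and $q = (k - k') = (p' - p)$. We have introduced here the ‘momentum-space’
currents

$$j^\mu_{s^+}(p,p') = e(p+p')^\mu \tag{8.98}$$

and

$$j^\mu_{e^-}(k,k') = -e\,\bar{u}(k',s')\gamma^\mu u(k,s) \tag{8.99}$$


shortening the notation by dropping the ‘em’ suffix, which is understood.
Before proceeding to calculate the cross section, some comments on (8.97)
are in order: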


Comment (1) 


The $j^\mu_{s^+}(p,p')$ and $j^\mu_{e^-}(k,k')$ in (8.98) and (8.99) are the
momentum-space versions of the x-dependent current matrix elements in (8.27)
and (8.55); they are, in fact, simply those matrix elements evaluated at
$x = 0$. The x-dependent matrix elements (8.27) and (8.55) both satisfy the
current conservation equations $\partial_\mu j^\mu(x) = 0$ as is easy to check
(problem 8.11). Correspondingly, it follows from (8.98) and (8.99) that we have


$$q_\mu j^\mu_{s^+}(p,p') = q_\mu j^\mu_{e^-}(k,k') = 0 \tag{8.100}$$

where $q = p' - p = k - k'$, and we have used the mass-shell conditions
$p^2 = p'^2 = M^2$, $\not{k}u = mu$, $\bar{u}'\not{k}' = m\bar{u}'$; the
relations (8.100) are the momentum-space versions of current conservation. The
$\xi$-dependent part of the photon propagator, which is proportional to
$q_\mu q_\nu$, therefore vanishes in the matrix element (8.97). This shows that
the amplitude is independent of the gauge parameter $\xi$; in other words, it
is gauge invariant and proportional simply to


$$j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \tag{8.101}$$


Comment (2) 


The amplitude (8.97) has the appealing form of two currents ‘hooked together’
by the photon propagator. In the form (8.101), it has a simple ‘semi-classical’
interpretation. Suppose we regard the process $e^-s^+ \to e^-s^+$ as the
scattering of the $e^-$, say, in the field produced by the $s^+$ (we can see
from (8.101) that the answer is going to be symmetrical with respect to
whichever of $e^-$ and $s^+$ is singled out in this way). Then the amplitude
will be, as in (8.43),


$$\mathcal{A}_{e^-s^+} = -i\int d^4x\,j^\mu_{e^-}(k,k')\,e^{-i(k-k')\cdot x}A_\mu(x) \tag{8.102}$$


where now the classical field $A_\mu(x)$ is not an ‘external’ Coulomb field but
the field caused by the motion of the $s^+$. It seems very plausible that this
$A_\mu(x)$ should be given by the solution of the Maxwell equations (2.22),
with the $j^\nu_{\rm em}(x)$ on the right-hand side given by the transition
current (8.11) (with $N = N' = 1$) appropriate to the motion
$s^+(p) \to s^+(p')$:


$$\Box A^\mu - \partial^\mu(\partial^\nu A_\nu) = j^\mu_{s^+}(x) \tag{8.103}$$


where

$$j^\mu_{s^+}(x) = e(p+p')^\mu e^{-i(p-p')\cdot x}. \tag{8.104}$$


Equation (8.103) will be much easier to solve if we can decouple the
components of $A^\mu$ by using the Lorentz condition $\partial^\mu A_\mu = 0$.
We are aware of the problems with this condition in the field-theory case (cf
section 7.3.2) but we are here treating $A^\mu$ classically. Although $A^\mu$
is not a free field in (8.103), it is easy to see that we may consistently take
$\partial^\mu A_\mu = 0$ provided that the current is conserved,
$\partial_\mu j^\mu_{s^+}(x) = 0$, which we know to be the case. Thus we have
to solve


$$\Box A^\mu(x) = e(p+p')^\mu e^{-i(p-p')\cdot x}. \tag{8.105}$$


Noting that


$$\Box\,e^{-i(p-p')\cdot x} = -(p-p')^2 e^{-i(p-p')\cdot x} \tag{8.106}$$


we obtain, by inspection,


$$A^\mu(x) = -\frac{1}{q^2}\,e(p+p')^\mu e^{-i(p-p')\cdot x} \tag{8.107}$$

FIGURE 8.5
Feynman diagram for $e^-s^+$ scattering in the one-photon exchange
approximation.

where $q = p' - p$. Inserting this expression into the amplitude (8.102) we find

$$\mathcal{A}_{e^-s^+} = i(2\pi)^4\delta^4(p + k - p' - k')\,\mathcal{M}_{e^-s^+} \tag{8.108}$$


where


$$i\mathcal{M}_{e^-s^+} = j^\mu_{s^+}(p,p')\,\frac{i g_{\mu\nu}}{q^2}\,j^\nu_{e^-}(k,k') \tag{8.109}$$

exactly as in (8.97) for $\xi = 1$ (the gauge appropriate to
‘$\partial_\mu A^\mu = 0$’).


Comment (3) 


From the work of chapter 6, it is clear that we can give a Feynman graph
interpretation of the amplitude (8.109), as shown in figure 8.5, and set out
the corresponding Feynman rules:


(i) At a vertex where a photon is emitted or absorbed by an $s^+$ particle,
the factor is $-ie(p+p')^\mu$ where $p, p'$ are the incident and outgoing
4-momenta of the $s^+$; the vertex for $s^-$ has the opposite sign.


(ii) At a vertex where a photon is emitted or absorbed by an $e^-$, the
factor is $ie\gamma^\mu$ ($e > 0$); for an $e^+$ it is $-ie\gamma^\mu$. (This
and the previous rule arise from associating one ‘$(-i)$’ factor in (8.94) or
(8.97) with each current.)


(iii) For each initial state fermion line a factor $u(k,s)$ and for each
final state fermion line a factor $\bar{u}(k',s')$; for each initial state
antifermion line a factor $\bar{v}(k,s)$ and for each final state antifermion
line a factor $v(k',s')$ (these rules reconstruct the $e^+$ Coulomb amplitudes
of section 8.2.4).


(iv) For an internal photon of 4-momentum $q$, there is a factor
$-ig_{\mu\nu}/q^2$ in the gauge $\xi = 1$.


(v) Multiplying these factors together gives the quantity $i\mathcal{M}$;
multiplying the result by an overall 4-momentum-conserving $\delta$-function
factor $(2\pi)^4\delta^4(p' + k' + \cdots - p - k - \cdots)$ gives the
quantity $\mathcal{A}$.


Comment (4) 


We know that our amplitude is proportional to


$$j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \tag{8.110}$$

Choosing the coordinate system such that $q = (q^0, 0, 0, |\boldsymbol{q}|)$,
the current conservation equations $q\cdot j_{s^+} = q\cdot j_{e^-} = 0$ read:


$$j^3 = q^0 j^0/|\boldsymbol{q}| \tag{8.111}$$


for both currents. Up to an overall sign, expression (8.101) can then be
written as


$$\begin{aligned}
&(j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-})/q^2 + (j^3_{s^+}j^3_{e^-} - j^0_{s^+}j^0_{e^-})/q^2 \\
&\quad = (j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-})/q^2 + j^0_{s^+}j^0_{e^-}/\boldsymbol{q}^2
\end{aligned} \tag{8.112}$$


using (8.111). The first term may be interpreted as being due to the exchange
of a transversely polarized photon (only the 1,2 components enter,
perpendicular to $\boldsymbol{q}$). For real photons $q^2 \to 0$, so that this
term will completely dominate the second. The latter, however, must obviously
be included when $q^2 \neq 0$, as of course is the case for this virtual
$\gamma$ (cf section 6.3.3). We note that the second term depends on the
3-momentum squared, $\boldsymbol{q}^2$, rather than the 4-momentum squared
$q^2$, and that it involves the charge densities $j^0_{s^+}$ and $j^0_{e^-}$.
Referring back to section 7.1, we can interpret it as the instantaneous
Coulomb interaction between these charge densities, since


$$\int d^3x\,\frac{e^{i\boldsymbol{q}\cdot\boldsymbol{x}}}{r} = 4\pi/\boldsymbol{q}^2, \qquad r = |\boldsymbol{x}|. \tag{8.113}$$


Thus, in summary, the single covariant amplitude (8.109) includes
contributions from the exchange of transversely polarized photons and from the
familiar Coulomb potential. This is the true relativistic extension of the
static Coulomb results of (8.15) and (8.44).
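
The split of the covariant expression into a transverse part and an instantaneous Coulomb part can be verified numerically. The sketch below is only an illustration: it draws two random complex 4-currents, imposes current conservation through (8.111), and compares the covariant contraction with the sum of the two terms of (8.112). The values of $q^0$ and $|\boldsymbol{q}|$ are arbitrary assumptions. With the metric $(+,-,-,-)$ the contraction $j^\mu_{s^+}g_{\mu\nu}j^\nu_{e^-}/q^2$ comes out as minus the sum of the two terms, an overall sign that does not affect the cross section.

```python
import numpy as np

rng = np.random.default_rng(0)
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

q0, q_abs = 0.4, 1.3                      # assumed values; q = (q0, 0, 0, |q|)
q = np.array([q0, 0.0, 0.0, q_abs])
q_squared = q @ METRIC @ q                # 4-momentum squared q^2
q_vec_squared = q_abs**2                  # 3-momentum squared


def conserved_current():
    """Random complex current with j^3 fixed by q.j = 0, as in (8.111)."""
    j = rng.normal(size=4) + 1j * rng.normal(size=4)
    j[3] = q0 * j[0] / q_abs
    return j


j_s, j_e = conserved_current(), conserved_current()

# Covariant form: j_s^mu g_{mu nu} j_e^nu / q^2
covariant = (j_s @ METRIC @ j_e) / q_squared

# Transverse-photon term and instantaneous Coulomb term of (8.112)
transverse = (j_s[1] * j_e[1] + j_s[2] * j_e[2]) / q_squared
coulomb = j_s[0] * j_e[0] / q_vec_squared
decomposed = transverse + coulomb
```

Here `covariant + decomposed` vanishes to rounding accuracy, which confirms that the longitudinal and time-like components combine into a term depending on $\boldsymbol{q}^2$ alone.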


8.3.2 The cross section for $e^-s^+ \to e^-s^+$
The invariant amplitude $\mathcal{M}_{e^-s^+}(s,s')$ for our process is given
by (8.109) as

$$\mathcal{M}_{e^-s^+}(s,s') = e\,\bar{u}(k',s')\gamma^\mu u(k,s)\,(-g_{\mu\nu}/q^2)\,e(p+p')^\nu \tag{8.114}$$


where we have now included the spin dependence of the amplitude
$\mathcal{M}_{e^-s^+}$ in the notation. The steps to the cross section are now
exactly as for the spin-0 case (section 6.3.4), as modified by the spin summing
and averaging already met in sections 8.2.1 and 8.2.3, particularly the latter.
The cross section for the scattering of an electron in spin state $s$ to one in
spin state $s'$ is (cf (6.110))


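$$d\sigma_{ss'} = \frac{1}{4E\omega|\boldsymbol{v}|}\,|\mathcal{M}_{e^-s^+}(s,s')|^2\,(2\pi)^4\delta^4(k'+p'-k-p) \times \frac{1}{(2\pi)^3}\frac{d^3k'}{2\omega'}\,\frac{1}{(2\pi)^3}\frac{d^3p'}{2E'} \tag{8.115}$$

where we have defined

$$k^\mu = (\omega, \boldsymbol{k}), \quad k'^\mu = (\omega', \boldsymbol{k}'), \quad p^\mu = (E, \boldsymbol{p}), \quad p'^\mu = (E', \boldsymbol{p}'). \tag{8.116}$$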


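For the unpolarized cross section we are required, as in (8.46), to evaluate
the quantity


$$\begin{aligned}
\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2 &= \left(\frac{e^2}{q^2}\right)^2\frac{1}{2}\sum_{s,s'}\bar{u}(k',s')\gamma^\mu u(k,s)\,\bar{u}(k,s)\gamma^\nu u(k',s') \\
&\qquad \times (p+p')_\mu(p+p')_\nu && (8.117)\\
&= \left(\frac{e^2}{q^2}\right)^2 L^{\mu\nu}(k,k')\,T_{\mu\nu}(p,p') && (8.118)
\end{aligned}$$


where the boson tensor $T_{\mu\nu}$ is just $(p+p')_\mu(p+p')_\nu$, and the
lepton tensor $L^{\mu\nu}$ has been evaluated in (8.79). Using
$q^2 = (k-k')^2 = 2m^2 - 2k\cdot k'$, the expression (8.79) can be rewritten as


$$L^{\mu\nu}(k,k') = 2[k'^\mu k^\nu + k'^\nu k^\mu + (q^2/2)g^{\mu\nu}]. \tag{8.119}$$

We then find (problem 8.12)

$$L^{\mu\nu}T_{\mu\nu} = 8[2(p\cdot k)(p\cdot k') + (q^2/2)M^2] \tag{8.120}$$


since $k'\cdot p' = k\cdot p$ and $k\cdot p' = k'\cdot p$ from 4-momentum
conservation, and $p'^2 = p^2 = M^2$ (we are using $m$ for the $e^-$ mass and
$M$ for the $s^+$ mass).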
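
The contraction (8.120) is easy to confirm numerically for any set of on-shell momenta obeying 4-momentum conservation. The following sketch, with assumed illustrative masses and CM-frame kinematics (the numerical values are not from the text), builds $L^{\mu\nu}$ from (8.119) and $T_{\mu\nu}$ from its definition and compares the contraction with the closed form.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    """Minkowski product a.b for contravariant 4-vectors."""
    return a @ METRIC @ b


m, M = 0.511, 139.57            # illustrative masses (assumed values)
p_cm, theta = 250.0, 0.7        # CM momentum and scattering angle (assumed)

# CM frame: electron along +z, scalar along -z; final state rotated by theta.
direction = np.array([np.sin(theta), 0.0, np.cos(theta)])
k = np.array([np.hypot(m, p_cm), 0.0, 0.0, p_cm])
p = np.array([np.hypot(M, p_cm), 0.0, 0.0, -p_cm])
k_prime = np.concatenate(([np.hypot(m, p_cm)], p_cm * direction))
p_prime = np.concatenate(([np.hypot(M, p_cm)], -p_cm * direction))

q = k - k_prime
q2 = dot(q, q)

# Lepton tensor (8.119) with upper indices; g^{mu nu} has the same entries as g_{mu nu}.
L_upper = 2 * (np.outer(k_prime, k) + np.outer(k, k_prime) + 0.5 * q2 * METRIC)

# Boson tensor T_{mu nu} = (p + p')_mu (p + p')_nu with lower indices.
sum_lower = METRIC @ (p + p_prime)
T_lower = np.outer(sum_lower, sum_lower)

contraction = np.sum(L_upper * T_lower)
closed_form = 8 * (2 * dot(p, k) * dot(p, k_prime) + 0.5 * q2 * M**2)
```

The agreement relies on $p\cdot q = -q^2/2$, which follows from $p'^2 = p^2 = M^2$ with $q = p' - p$.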

We can now give the differential cross section in the CM frame by taking
over the formula (6.129) with


$$|\mathcal{M}|^2 \to \frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2$$


so as to obtain


$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm CM} = \frac{2\alpha^2}{W^2(q^2)^2}\left[2(p\cdot k)(p\cdot k') + M^2q^2/2\right] \tag{8.121}$$


where $\alpha = e^2/4\pi$ and $W^2 = (k+p)^2$.

FIGURE 8.6
Two-body scattering in the ‘laboratory’ frame.


A somewhat more physically meaningful formula is found if we ask for
the cross section in the ‘laboratory’ frame which we define by the condition
$p^\mu = (M, \boldsymbol{0})$. The evaluation of the phase space integral
requires some care and this is detailed in appendix K. The result is


$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{k'}{k}\,\cos^2(\theta/2). \tag{8.122}$$


In this formula we have neglected the electron mass in the kinematics so that


$$k \equiv |\boldsymbol{k}| = \omega \tag{8.123}$$
$$k' \equiv |\boldsymbol{k}'| = \omega' \tag{8.124}$$

and

$$q^2 = -4kk'\sin^2(\theta/2) \tag{8.125}$$


where $\theta$ is the electron scattering angle in this frame, as shown in
figure 8.6, and

$$(k/k') = 1 + (2k/M)\sin^2(\theta/2) \tag{8.126}$$


from equation (K.20). Note that there is a slight abuse of notation here: in the
context of results for such laboratory frame calculations, ‘$k$’ and ‘$k'$’ are
not 4-vectors, but rather the moduli of 3-vectors, as defined in equations
(8.123) and (8.124).

We shall denote the cross section (8.122) by


$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} \equiv \text{‘no-structure’ cross section.} \tag{8.127}$$

It describes essentially the ‘kinematics’ of a relativistic electron scattering
from a pointlike spin-0 target which recoils. Comparing the result (8.122)
with equation (8.49), and remembering that here $Z = 1$ and we are taking
$v \to 1$ for the electron, we see that the effect of recoil is contained in the
factor $(k'/k)$, in this limit. We recover the ‘no-recoil’ result (8.49) in the
limit $M \to \infty$, as expected. In particular, referring to (8.125), we
understand Rutherford’s ‘$\sin^{-4}\theta/2$’ factor in terms of the exchange
of a massless quantum, via the propagator factor $(1/q^2)^2$.

FIGURE 8.7
$e^-\pi^+$ scattering amplitude.


This ‘no-structure’ cross section also occurs in the cross section for the
scattering of electrons by protons or muons: the appellation ‘no-structure’
will be made clearer in the discussion of form factors which follows. As in
the case of $e^+$ Coulomb scattering, the cross sections for $e^-s^+$ and for
$e^+s^+$ scattering are identical at this (lowest) order of perturbation theory.
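
As an illustration of the laboratory-frame formulae, the sketch below evaluates the ‘no-structure’ cross section (8.122) together with the recoil relation (8.126) and the momentum transfer (8.125). The function names and the numerical inputs are illustrative assumptions; natural units are used, so the cross section is returned in units of inverse energy squared.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant (approximate value)


def scattered_momentum(k, theta, target_mass):
    """k' from (8.126): k/k' = 1 + (2k/M) sin^2(theta/2), electron mass neglected."""
    return k / (1.0 + (2.0 * k / target_mass) * math.sin(theta / 2) ** 2)


def momentum_transfer_squared(k, theta, target_mass):
    """q^2 from (8.125): q^2 = -4 k k' sin^2(theta/2)."""
    k_prime = scattered_momentum(k, theta, target_mass)
    return -4.0 * k * k_prime * math.sin(theta / 2) ** 2


def no_recoil_cross_section(k, theta):
    """The M -> infinity limit of (8.122), in which k'/k -> 1."""
    sin2 = math.sin(theta / 2) ** 2
    return ALPHA**2 * math.cos(theta / 2) ** 2 / (4 * k**2 * sin2**2)


def no_structure_cross_section(k, theta, target_mass):
    """Equation (8.122): the no-recoil result multiplied by the recoil factor k'/k."""
    k_prime = scattered_momentum(k, theta, target_mass)
    return no_recoil_cross_section(k, theta) * k_prime / k


# Illustrative inputs (assumed): energies in GeV, angle in radians.
k_in, angle, mass = 0.5, 1.0, 0.938
k_out = scattered_momentum(k_in, angle, mass)
recoil_energy = math.sqrt(mass**2 + k_in**2 + k_out**2
                          - 2 * k_in * k_out * math.cos(angle))
energy_balance = k_in + mass - k_out - recoil_energy
```

The quantity `energy_balance` vanishes, showing that (8.126) is simply energy conservation for a massless electron scattering from a target initially at rest.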


8.4 Scattering from a non-point-like object: the pion form factor in
$e^-\pi^+ \to e^-\pi^+$


As remarked earlier, we have been careful not to call the ‘$s^+$’ particle a
$\pi^+$, because the latter is a composite system which cannot be expected to
have point-like interactions with the electromagnetic field, as has been
assumed for the $s^+$; rather, in the case of the $\pi^+$ it is the quark
constituents which interact locally with the electromagnetic field. The quarks
also, of course, interact strongly with each other via the interactions of QCD,
and since these are strong they cannot (in this case) be treated
perturbatively. Indeed, a full understanding of the electromagnetically probed
‘structure’ of hadrons has not yet been achieved. Instead, we must describe the
$e^-$ scattering from physical $\pi^+$’s in terms of a phenomenological
quantity, the pion form factor, which encapsulates in a relativistically
invariant manner the ‘non-point-like’ aspect of the hadronic state $\pi^+$.


The physical process is

$$e^-(k,s) + \pi^+(p) \to e^-(k',s') + \pi^+(p') \tag{8.128}$$


which we represent, in general, by figure 8.7. To lowest order in $\alpha$, the
amplitude is represented diagrammatically by a generalization of figure 8.5,
shown in figure 8.8, in which the point-like $ss\gamma$ vertex is replaced by
the $\pi\pi\gamma$ ‘blob’, which signifies all the unknown strong interaction
corrections.

FIGURE 8.8
One-photon exchange amplitude in $e^-\pi^+$ scattering, including hadronic
corrections at the $\pi\pi\gamma$ vertex.


8.4.1 $e^-$ scattering from a charge distribution


It is helpful to begin the discussion by returning to $e^-$ Coulomb scattering
again, but this time let us consider the case in which the potential
$A^0(\boldsymbol{x})$ corresponds, not to a point charge, but to a spread-out
charge density $\rho(\boldsymbol{x})$. Then $A^0(\boldsymbol{x})$ satisfies
Poisson’s equation


$$\nabla^2 A^0(\boldsymbol{x}) = -Ze\rho(\boldsymbol{x}). \tag{8.129}$$

Note that if $A^0(\boldsymbol{x}) = Ze/4\pi|\boldsymbol{x}|$ as in (8.13) then
$\rho(\boldsymbol{x}) = \delta(\boldsymbol{x})$ (see appendix G) and we recover
the point-like source. The calculation of the Coulomb matrix element will
proceed as before, except that now we require, at equation (8.43), the Fourier
transform


$$\tilde{A}^0(\boldsymbol{q}) = \int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}} A^0(\boldsymbol{x})\,d^3x \tag{8.130}$$


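where $\boldsymbol{q} = \boldsymbol{k} - \boldsymbol{k}'$. To evaluate (8.130),
note first that from the definition of $A^0(\boldsymbol{x})$, we can write


$$\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}\,\nabla^2 A^0(\boldsymbol{x})\,d^3x = -Ze\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}\rho(\boldsymbol{x})\,d^3x \equiv -ZeF(\boldsymbol{q}) \tag{8.131}$$


where the (static) form factor $F(\boldsymbol{q})$ has been introduced, the
Fourier transform of $\rho(\boldsymbol{x})$, satisfying


$$F(0) = \int \rho(\boldsymbol{x})\,d^3x = 1. \tag{8.132}$$

Condition (8.132) simply means that the total charge is $Ze$. The left-hand
side of (8.131) can be transformed by two (three-dimensional) partial
integrations to give


$$\int (\nabla^2 e^{-i\boldsymbol{q}\cdot\boldsymbol{x}})\,A^0(\boldsymbol{x})\,d^3x = -\boldsymbol{q}^2\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}} A^0(\boldsymbol{x})\,d^3x. \tag{8.133}$$

Using this result in (8.131), we find

$$\tilde{A}^0(\boldsymbol{q}) = \frac{F(\boldsymbol{q})}{\boldsymbol{q}^2}\,Ze. \tag{8.134}$$


Thus referring to equation (8.44) for example, the net result of the
non-point-like charge distribution is to multiply the ‘point-like’ amplitude
$Ze^2/\boldsymbol{q}^2$ by the form factor $F(\boldsymbol{q})$ which in this
simple static case has the interpretation of the Fourier transform of the
charge distribution. So, for this (infinitely heavy) $\pi^+$ case, the ‘blob’
in figure 8.8 would be represented by $F(\boldsymbol{q})$.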

To gain some idea of what $F(\boldsymbol{q}^2)$ might look like, consider a
simple exponential shape for $\rho(\boldsymbol{x})$:


$$\rho(\boldsymbol{x}) = \frac{1}{8\pi a^3}\,e^{-|\boldsymbol{x}|/a} \tag{8.135}$$

which has been normalized according to (8.132). Then $F(\boldsymbol{q}^2)$ is
(problem 8.13)

$$F(\boldsymbol{q}^2) = \frac{1}{(\boldsymbol{q}^2a^2 + 1)^2}. \tag{8.136}$$


We see that $F(\boldsymbol{q}^2)$ decreases smoothly away from unity at
$\boldsymbol{q}^2 = 0$. The characteristic scale of the fall-off in
$|\boldsymbol{q}|$ is $\sim a^{-1}$ from (8.136), which, as expected from
Fourier transform theory, is the reciprocal of the spatial fall-off, which is
approximately $a$ from (8.135); the root mean square radius of the distribution
(8.135) is actually $\sqrt{12}\,a$ (problem 8.13). Since
$\boldsymbol{q}^2 = 4\boldsymbol{k}^2\sin^2\theta/2$, a larger
$\boldsymbol{q}^2$ means a larger $\theta$: hence, in scattering from an
extended charge distribution, the cross section at larger angles will drop
below the point-like value. This is, of course, how Rutherford deduced that
the nucleus had a spatial extension.
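
The results quoted for the exponential distribution can be reproduced by direct numerical integration. The sketch below is one possible check, not a prescribed method: it uses the fact that for a spherically symmetric density the Fourier transform reduces to the radial integral $F = 4\pi\int_0^\infty r^2\rho(r)\,[\sin(|\boldsymbol{q}|r)/(|\boldsymbol{q}|r)]\,dr$, and evaluates it with a hand-written trapezoidal rule. The value of $a$, the grid size and the cut-off radius are assumptions chosen for illustration.

```python
import numpy as np


def trapezoid(y, x):
    """Composite trapezoidal rule on the sample points x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def rho(r, a):
    """Exponential charge density (8.135), normalized to unit total charge."""
    return np.exp(-r / a) / (8 * np.pi * a**3)


def radial_grid(a, r_max_in_units_of_a=60.0, n_points=200001):
    return np.linspace(0.0, r_max_in_units_of_a * a, n_points)


def form_factor_numeric(q_abs, a):
    """F(q) = 4 pi * integral of r^2 rho(r) sin(qr)/(qr) dr."""
    r = radial_grid(a)
    # np.sinc(x) is sin(pi x)/(pi x), so sin(qr)/(qr) = np.sinc(q r / pi).
    kernel = np.sinc(q_abs * r / np.pi)
    return trapezoid(4 * np.pi * r**2 * rho(r, a) * kernel, r)


def form_factor_closed(q_abs, a):
    """Closed form (8.136)."""
    return 1.0 / (1.0 + (q_abs * a) ** 2) ** 2


def rms_radius_numeric(a):
    """Square root of the mean square radius of the distribution (8.135)."""
    r = radial_grid(a)
    return float(np.sqrt(trapezoid(4 * np.pi * r**4 * rho(r, a), r)))


a = 0.7                                   # illustrative length scale (assumed)
sample_q = [0.0, 0.5, 2.0]
numeric = [form_factor_numeric(q_abs, a) for q_abs in sample_q]
closed = [form_factor_closed(q_abs, a) for q_abs in sample_q]
rms = rms_radius_numeric(a)
```

At $|\boldsymbol{q}| = 0$ the numerical value reproduces the normalization (8.132), and `rms` agrees with $\sqrt{12}\,a$.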

We now seek a Lorentz-invariant generalization of this static form factor.
In the absence of a fundamental understanding of the $\pi^+$ structure coming
from QCD, we shall rely on Lorentz invariance and electromagnetic current
conservation (one aspect of gauge invariance) to restrict the general form of
the $\pi\pi\gamma$ vertex shown in figure 8.8. The use of invariance arguments
to place restrictions on the form of amplitudes is an extremely general and
important tool, in the absence of a complete theory.


8.4.2 Lorentz invariance 


First, consider Lorentz invariance. We seek to generalize the point-like
$ss\gamma$ vertex (cf (8.98) and comment (1) after (8.99))


$$j^\mu_{s^+}(p,p') = \langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(0)|s^+,p\rangle = e(p+p')^\mu \tag{8.137}$$

to $j^\mu_{\pi^+}(p,p')$, which will include strong interaction effects.
Whatever these effects are, they cannot destroy the 4-vector character of the
current. To construct the general form of $j^\mu_{\pi^+}(p,p')$ therefore, we
must first enumerate the independent momentum 4-vectors we have at our disposal
to parametrize the 4-vector nature of the current. These are just


$$p, \quad p' \quad \text{and} \quad q \tag{8.138}$$


subject to the condition

$$p' = p + q. \tag{8.139}$$


There are two independent combinations; these we can choose to be the linear
combinations

$$(p' + p)_\mu \tag{8.140}$$


and

$$(p' - p)_\mu = q_\mu. \tag{8.141}$$


Both of these 4-vectors can, in general, parametrize the 4-vector nature of the
electromagnetic current of a real pion. Moreover, they can be multiplied by
an unknown scalar function of the available Lorentz scalar products for this
process. Since

$$p^2 = p'^2 = M^2 \tag{8.142}$$


and


$$q^2 = 2M^2 - 2p\cdot p' \tag{8.143}$$


there is only one independent scalar in the problem, which we may take to be
$q^2$, the 4-momentum transfer to the vertex. Thus, from Lorentz invariance,
we are led to write the electromagnetic vertex of a pion in the form


$$j^\mu_{\pi^+}(p,p') = \langle \pi^+,p'|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+,p\rangle = e[F(q^2)(p'+p)^\mu + G(q^2)q^\mu]. \tag{8.144}$$


The functions $F$ and $G$ are called ‘form factors’.

This is as far as Lorentz invariance can take us. To identify the pion form 
factor, we must consider our second symmetry principle, gauge invariance — 
in the form of current conservation. 


8.4.3 Current conservation 


The Maxwell equations (7.65) reduce, in the Lorentz gauge

$$\partial_\mu A^\mu = 0 \tag{8.145}$$


to the simple form


$$\Box A^\mu = j^\mu \tag{8.146}$$


and the gauge condition is consistent with the familiar current conservation
condition

$$\partial_\mu j^\mu = 0. \tag{8.147}$$

As we have seen in (8.100), the current conservation condition is equivalent
to the condition


$$q_\mu\langle \pi^+(p')|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+(p)\rangle = 0 \tag{8.148}$$


on the pion electromagnetic vertex.
In the case of the point-like $s^+$ this is clearly satisfied since


$$q\cdot(p'+p) = 0 \tag{8.149}$$

with the aid of (8.142). In the general case we obtain the condition

$$q_\mu[F(q^2)(p'+p)^\mu + G(q^2)q^\mu] = 0. \tag{8.150}$$


The first term vanishes as before, but $q^2 \neq 0$ in general, and we therefore
conclude that current conservation implies that


$$G(q^2) = 0. \tag{8.151}$$


In other words, all the virtual strong interaction effects at the
$\pi\pi\gamma$ vertex are described by one scalar function of the virtual
photon’s squared 4-momentum:

$$\underbrace{e(p'+p)^\mu}_{\text{point pion}} \;\to\; \underbrace{eF(q^2)(p'+p)^\mu}_{\text{real pion}} \tag{8.152}$$

$F(q^2)$ is the electromagnetic form factor of the pion, which generalizes the
static form factor $F(\boldsymbol{q}^2)$ of section 8.4.1. The pion
electromagnetic vertex is then

$$j^\mu_{\pi^+}(p,p') = eF(q^2)(p'+p)^\mu. \tag{8.153}$$


The electric charge is defined to be the coupling at zero momentum transfer,
so the form factor is normalized by the condition (cf (8.132))


$$F(0) = 1. \tag{8.154}$$


To lowest order in $\alpha$, the invariant amplitude for
$e^-\pi^+ \to e^-\pi^+$ is therefore given by replacing $j^\mu_{s^+}(p,p')$ in
(8.97) or (8.109) by $j^\mu_{\pi^+}(p,p')$:


$$i\mathcal{M}_{e^-\pi^+} = -ie(p+p')^\mu F((p'-p)^2)\left(\frac{-ig_{\mu\nu}}{(p'-p)^2}\right)[+ie\,\bar{u}(k',s')\gamma^\nu u(k,s)]. \tag{8.155}$$

It is clear that the effect of the pion structure is simply to multiply the
‘no-structure’ cross section (8.122) by the square of the form factor,
$F(q^2 = (p'-p)^2)$.
For $e^-\pi^+ \to e^-\pi^+$ in the CM frame we may take
$p = (E, \boldsymbol{p})$ and $p' = (E, \boldsymbol{p}')$ with
$|\boldsymbol{p}| = |\boldsymbol{p}'|$ and
$E = (M^2 + \boldsymbol{p}^2)^{1/2}$. Then


$$q^2 = (p'-p)^2 = -4\boldsymbol{p}^2\sin^2\theta/2 \tag{8.156}$$


as in section 8.1, where $\theta$ is now the CM scattering angle between
$\boldsymbol{p}$ and $\boldsymbol{p}'$.

FIGURE 8.9
$e^+e^- \to \pi^+\pi^-$ scattering amplitude.


Hence $F(q^2)$ can be probed for negative (space-like) values of $q^2$, in the
process $e^-\pi^+ \to e^-\pi^+$. As in the static case, we expect the form
factor to fall off as $-q^2$ increases since, roughly speaking, it represents
the amplitude for the target to remain intact when probed by the
electromagnetic current. As $-q^2$ increases, the amplitudes of inelastic
processes which involve the creation of extra particles become greater, and the
elastic amplitude is correspondingly reduced. We shall consider inelastic
scattering in the following chapter.
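
The two invariance arguments above can be illustrated with explicit numbers. In the sketch below (CM-frame kinematics with assumed, illustrative values for the pion mass, momentum and angle), the general vertex (8.144) is contracted with $q_\mu$: the $F$ term drops out because $q\cdot(p'+p) = 0$, while a non-zero $G$ would leave a remainder $G\,q^2$ and so violate current conservation. The space-like value of $q^2$ in (8.156) is checked as well.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    """Minkowski product a.b for contravariant 4-vectors."""
    return a @ METRIC @ b


M = 0.13957                      # pion mass in GeV (illustrative value)
p_abs, theta = 0.8, 1.1          # CM momentum and scattering angle (assumed)

E = np.hypot(M, p_abs)
p = np.array([E, 0.0, 0.0, p_abs])
p_prime = np.array([E, p_abs * np.sin(theta), 0.0, p_abs * np.cos(theta)])
q = p_prime - p
q2 = dot(q, q)


def pion_vertex(F, G, charge=1.0):
    """General vertex (8.144): e [F (p' + p)^mu + G q^mu]."""
    return charge * (F * (p_prime + p) + G * q)


# Contract q_mu with the vertex for G = 0 and for an arbitrary non-zero G.
divergence_conserved = dot(q, pion_vertex(F=0.6, G=0.0))
divergence_violated = dot(q, pion_vertex(F=0.6, G=0.3))
q2_from_angle = -4 * p_abs**2 * np.sin(theta / 2) ** 2
```

The values of $F$ and $G$ used here are arbitrary numbers chosen only to exercise the algebra; nothing about the physical form factor is implied.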

Interestingly, $F(q^2)$ may also be measured at positive (time-like) $q^2$, in
the related reaction $e^+e^- \to \pi^+\pi^-$ as we now discuss.


8.5 The form factor in the time-like region: $e^+e^- \to \pi^+\pi^-$
and crossing symmetry


The physical process is

$$e^+(k_1,s_1) + e^-(k,s) \to \pi^+(p') + \pi^-(p_1) \tag{8.157}$$


as shown in figure 8.9. We can use this as an instructive exercise in the
Feynman interpretation of section 3.4.4. From that section, we know that the
invariant amplitude for (8.157) is equal to minus the amplitude for a process
in which the ingoing antiparticle $e^+$ with $(k_1, s_1)$ becomes an outgoing
particle $e^-$ with $(-k_1, -s_1)$, and the outgoing antiparticle $\pi^-$ with
$p_1$ becomes an ingoing particle $\pi^+$ with $-p_1$. In this way the
‘physical’ (positive 4-momentum) antiparticle states ($e^+$ and $\pi^-$) are
replaced by appropriate ‘unphysical’ (negative 4-momentum) particle states
($e^-$ and $\pi^+$). These changes transform figure 8.9 to figure 8.10.

If we now look at figure 8.10 ‘from the top downwards’ (instead of from left
to right; remember that Feynman diagrams are not in coordinate space!), we
see a process of $e^-\pi^+$ scattering, namely


$$e^-(k,s) + \pi^+(-p_1) \to e^-(-k_1,-s_1) + \pi^+(p'). \tag{8.158}$$

FIGURE 8.10
The amplitude of figure 8.9, with positive 4-momentum antiparticles replaced
by negative 4-momentum particles.


FIGURE 8.11
The amplitude of figure 8.10 redrawn so as to obtain a reaction in which the
initial state has only ‘ingoing’ lines and the final state has only ‘outgoing’
lines.


FIGURE 8.12
One-photon exchange amplitude for the process of figure 8.11.

FIGURE 8.13
One-photon exchange amplitude for the process of figure 8.9.


But (8.158) is something we have already calculated! (Though we shall have
to substitute a negative-energy spinor $v$ for a positive-energy one $u$.) In
fact, let us redraw figure 8.10 as figure 8.11 to make it look more like figure
8.7. Then, to lowest order in $\alpha$, the amplitude for figure 8.11 is shown
in figure 8.12 (compare figure 8.8). To obtain the corresponding mathematical
expression for the amplitude $i\mathcal{M}_{e^+e^- \to \pi^+\pi^-}$, we simply
need to modify (8.155): (i) by inserting a minus sign; (ii) by replacing $p$ by
$-p_1$ and $k'$ by $-k_1$ as in figure 8.12; and (iii) by replacing
$\bar{u}(k',s')$ by $\bar{v}(k_1,s_1)$. This yields the invariant amplitude for
figure 8.12 as


$$i\mathcal{M}_{e^+e^- \to \pi^+\pi^-} = -ie(-p_1+p')^\mu F((p_1+p')^2)\left(\frac{-ig_{\mu\nu}}{(p_1+p')^2}\right) \times [-ie\,\bar{v}(k_1,s_1)\gamma^\nu u(k,s)] \tag{8.159}$$


which is represented by the Feynman diagram of figure 8.13 for the original
process of (8.157) and figure 8.9.

In the language introduced in section 6.3.3, figure 8.13 is an ‘s-channel
process’ ($s = (k+k_1)^2 = (p_1+p')^2$) for $e^+e^- \to \pi^+\pi^-$, whereas
figure 8.8 is a ‘t-channel process’ ($t = (k-k')^2 = (p'-p)^2$) for
$e^-\pi^+ \to e^-\pi^+$. However, we have seen that the amplitude for the
$e^+e^- \to \pi^+\pi^-$ process can be obtained from the
$e^-\pi^+ \to e^-\pi^+$ amplitude by making the replacement $k' \to -k_1$,
$p \to -p_1$ (together with the sign, and $\bar{u} \to \bar{v}$). Under these
replacements of the 4-momenta, the variable $t = (k-k')^2 = (p-p')^2$ of
figure 8.8 becomes the variable $s = (k+k_1)^2 = (p_1+p')^2$ of figure 8.13. In
particular, as is evident in the formula (8.159), the same form factor $F$ is a
function of the invariant $s = (p_1+p')^2$ in process (8.157), and of
$t = (p-p')^2$ in process (8.128). The interesting thing is that whereas (as we
have seen) ‘$t$’ is negative in process (8.128), ‘$s$’ for process (8.157) is
the square of the total CM energy, which is $\geq 4M^2$ where $M$ is the pion
mass ($2M$ is the threshold energy for the reaction to proceed in the CM
system). Thus the form factor can be probed at negative values of its argument
in the process $e^-\pi^+ \to e^-\pi^+$, and at positive values $\geq 4M^2$ in
the process $e^+e^- \to \pi^+\pi^-$. In the next chapter (section 9.5) we shall
see how, in the latter process, meson resonances dominate $F(s)$.

The procedure whereby an ingoing/outgoing antiparticle is switched to
an outgoing/ingoing particle is called ‘crossing’ (the state is being ‘crossed’
from one side of the reaction to the other). By an extension of this language,
$e^+e^- \to \pi^+\pi^-$ is called the crossed process relative to
$e^-\pi^+ \to e^-\pi^+$ (or vice versa). The fact that the amplitude for a
given process and its ‘crossed’ analogue are directly related via the Feynman
interpretation (or by quantum field theory!) is called ‘crossing symmetry’. In
the example studied here, what is an s-channel process for one reaction becomes
a t-channel process for the crossed reaction. Essentially, little more is
involved than looking in the one case from left to right and, in the other,
from top to bottom!


8.6 Electron Compton scattering
8.6.1 The lowest-order amplitudes


We proceed to explore some other elementary electromagnetic processes. So
far we have not considered a reaction with external photons, so let us now
discuss electron Compton scattering


$$\gamma(k,\lambda) + e^-(p,s) \to \gamma(k',\lambda') + e^-(p',s') \tag{8.160}$$


where the $\lambda$’s stand for the polarizations of the photons. Since only
the $\gamma$’s and $e^-$’s are involved, the interaction Hamiltonian is simply
$\hat{H}'_{\rm D}$, and it is clear that this must act at least twice in the
reaction (8.160). By following the method of section 6.3.2 one can formally
derive what we are here going to assume is by now obvious, which is that to
order $e^2$ (i.e. $\alpha$ in the amplitude) there are two contributing Feynman
graphs, as shown in figures 8.14(a) and (b). The first is an s-channel process,
the second a u-channel process. We already know the factors for the vertices
and for the external electron lines; we need to know the factors for the
internal electron lines (propagators) and the external photon lines. The
fermion propagator was given in section 7.2 and is
$i/(\not{q} - m + i\epsilon)$ for a line carrying 4-momentum $q$. As regards
the ‘external-$\gamma$’ factor, this will arise from contractions of the form
(cf (6.90))


$$\sqrt{2E_{k'}}\,\langle 0|\hat\alpha(k',\lambda')\hat{A}^\mu(x_1)|0\rangle = \epsilon^{\mu *}(k',\lambda')\,e^{ik'\cdot x_1} \tag{8.161}$$


where the evaluation of the vev has used the mode expansion (7.104) and the
commutation relations (7.108), as usual; note, however, that only transverse
polarization states ($\lambda, \lambda' = 1$ and 2) enter in the external
(physical) photon lines in figures 8.14(a) and (b).

FIGURE 8.14
$O(e^2)$ contributions to electron Compton scattering.


Thus we add two more rules to the (i)-(v) of section 8.3.1:


(vi) For an incoming photon of 4-momentum $k$ and polarization $\lambda$,
there is a factor $\epsilon^\mu(k,\lambda)$; for an outgoing one,
$\epsilon^{\mu *}(k',\lambda')$.


(vii) For an internal spin-$\frac{1}{2}$ particle carrying 4-momentum $q$,
there is a factor
$i/(\not{q} - m + i\epsilon) = i(\not{q} + m)/(q^2 - m^2 + i\epsilon)$.

The invariant amplitude $\mathcal{M}_{\gamma e^-}$ corresponding to figures
8.14(a) and (b) is therefore


$$\begin{aligned}
\mathcal{M}_{\gamma e^-} = {} & -e^2\epsilon^*_\nu(k',\lambda')\epsilon_\mu(k,\lambda)\,\bar{u}(p',s')\gamma^\nu\frac{(\not{p}+\not{k}+m)}{(p+k)^2-m^2}\gamma^\mu u(p,s) \\
& -e^2\epsilon^*_\nu(k',\lambda')\epsilon_\mu(k,\lambda)\,\bar{u}(p',s')\gamma^\mu\frac{(\not{p}-\not{k}'+m)}{(p-k')^2-m^2}\gamma^\nu u(p,s).
\end{aligned} \tag{8.162}$$

To get the spinor factors in expressions such as these, the rule is to start
at the ingoing fermion line (‘$u(p,s)$’) and follow the line through until the
end, inserting vertices and propagators in the right order, until you reach the
outgoing state (‘$\bar{u}$’). Note that here $s = (p+k)^2$ and
$u = (p-k')^2$.


8.6.2 Gauge invariance 


We learned in section 7.3.1 that the gauge symmetry
($A^\mu \to A^\mu - \partial^\mu\chi$) of electromagnetism, as applied to real
free photons, implied that any photon polarization vector
$\epsilon^\mu(k,\lambda)$ could be replaced by


$$\epsilon'^\mu(k,\lambda) = \epsilon^\mu(k,\lambda) + \beta k^\mu \tag{8.163}$$


where $\beta$ is an arbitrary constant. Such a transformation amounted to a
change of gauge, always remaining within the Lorentz gauge for which
$\epsilon\cdot k = \epsilon'\cdot k = 0$. Thus our amplitude (8.162) must be
unchanged if we make either or both the replacements
$\epsilon \to \epsilon + \beta k$ and $\epsilon^* \to \epsilon^* + \beta k'$
indicated in (8.163). This means that if in (8.162) we replace either or both
of $\epsilon_\mu(k,\lambda)$ and $\epsilon^*_\nu(k',\lambda')$ by $k_\mu$

FIGURE 8.15
General one-photon process.


and $k'_\nu$, respectively, the result has to be zero. This can indeed be
verified (problem 8.14).
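
A numerical version of this check is sketched below, as an illustration rather than a substitute for the algebra of problem 8.14. It assumes the Dirac representation of the $\gamma$ matrices, CM-frame kinematics with illustrative values, and unnormalized spinors built as $(\not{p} + m)$ acting on a basis vector (so that $(\not{p} - m)u = 0$ holds on shell); the function names are illustrative. The amplitude (8.162) is evaluated with physical transverse polarization vectors, and then with $\epsilon_\mu \to k_\mu$ or $\epsilon^*_\nu \to k'_\nu$.

```python
import numpy as np

I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
GAMMA += [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    return a @ METRIC @ b


def slash(a):
    """gamma^mu a_mu for a contravariant 4-vector a^mu."""
    a_lower = METRIC @ a
    return sum(a_lower[mu] * GAMMA[mu] for mu in range(4))


def spinor_u(p, m, index):
    """Unnormalized solution of (p-slash - m) u = 0, built as (p-slash + m) e_index."""
    basis = np.zeros(4, dtype=complex)
    basis[index] = 1.0
    return (slash(p) + m * I4) @ basis


def compton_amplitude(eps_in, eps_out_conj, k, k_out, p, p_out, m, u_in, u_out, e=1.0):
    """Amplitude (8.162); eps_out_conj stands for the vector epsilon^*(k', lambda')."""
    u_bar = u_out.conj() @ GAMMA[0]
    s_var, u_var = dot(p + k, p + k), dot(p - k_out, p - k_out)
    s_channel = (slash(eps_out_conj) @ (slash(p + k) + m * I4) @ slash(eps_in)
                 / (s_var - m**2))
    u_channel = (slash(eps_in) @ (slash(p - k_out) + m * I4) @ slash(eps_out_conj)
                 / (u_var - m**2))
    return -e**2 * (u_bar @ (s_channel + u_channel) @ u_in)


m, w, theta = 0.511, 2.0, 0.9            # assumed illustrative values
E = np.hypot(m, w)
k = np.array([w, 0.0, 0.0, w])
p = np.array([E, 0.0, 0.0, -w])
k_out = np.array([w, w * np.sin(theta), 0.0, w * np.cos(theta)])
p_out = np.array([E, -w * np.sin(theta), 0.0, -w * np.cos(theta)])
eps_in = np.array([0.0, 1.0, 0.0, 0.0])                        # transverse to k
eps_out = np.array([0.0, np.cos(theta), 0.0, -np.sin(theta)])  # transverse to k_out (real)

physical, ward_in, ward_out = [], [], []
for i in (0, 1):
    for j in (0, 1):
        args = (k, k_out, p, p_out, m, spinor_u(p, m, i), spinor_u(p_out, m, j))
        physical.append(abs(compton_amplitude(eps_in, eps_out, *args)))
        ward_in.append(abs(compton_amplitude(k, eps_out, *args)))      # eps -> k
        ward_out.append(abs(compton_amplitude(eps_in, k_out, *args)))  # eps* -> k'
```

Neither diagram vanishes on its own under the replacement; the s-channel and u-channel terms cancel against each other, which is why both graphs of figure 8.14 are needed for a gauge-invariant result.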

A similar result is generally true and very important. Consider a process,
shown in figure 8.15, involving a photon of momentum $k^\mu$, whose polarization
state is described by the vector $\epsilon^\mu$. The amplitude $\mathcal{A}$ for
this process must be linear in the photon polarization vector and thus we may
write

$$\mathcal{A} = \epsilon^\mu T_\mu \tag{8.164}$$


where $T_\mu$ depends on the particular process under consideration. With the
Lorentz choice for $\epsilon^\mu$ we have

$$k\cdot\epsilon = 0. \tag{8.165}$$


But gauge invariance implies that if we replace $\epsilon^\mu$ in (8.164) by
$k^\mu$ we must get zero:

$$k^\mu T_\mu = 0. \tag{8.166}$$


This important condition on $T_\mu$ is known as a Ward identity (Ward 1950).


8.6.3 The Compton cross section 


The calculation of the cross section is of considerable interest, since it is required when considering lowest-order QCD corrections to the parton model for deep inelastic scattering of leptons from nucleons (see the following chapter and volume 2). We must average $|\mathcal{M}_{\gamma e^-}|^2$ over initial electron spins and photon polarizations and sum over final ones. Consider first the s-channel process of figure 8.14(a), with amplitude $\mathcal{M}^{(s)}_{\gamma e^-}$. For this contribution we must evaluate

$$\frac{e^4}{4(s - m^2)^2} \sum_{\lambda, \lambda', s, s'} \epsilon'^*_\nu \epsilon_\mu \epsilon^*_\rho \epsilon'_\sigma \, \bar{u}' \gamma^\nu (\not p + \not k + m) \gamma^\mu u \, \bar{u} \gamma^\rho (\not p + \not k + m) \gamma^\sigma u' \tag{8.167}$$

where we have shortened the notation in an obvious way and introduced the invariant Mandelstam variable (section 6.3.3) $s = (p + k)^2$. We know how to write the spin sums in a convenient form, as a trace. We need to find a similar trick for the polarization sum.
Consider the general ‘one-photon’ process shown in figure 8.15, with amplitude $A_\gamma = \epsilon^\mu(k, \lambda) T_\mu$, where $\epsilon^\mu(k, 1) = (0, 1, 0, 0)$ and $\epsilon^\mu(k, 2) = (0, 0, 1, 0)$, and $k^\mu = (k, 0, 0, k)$. Then the required polarization sum would be

$$\sum_{\lambda = 1, 2} \epsilon^\mu(k, \lambda) T_\mu \, \epsilon^{\nu *}(k, \lambda) T^*_\nu = |T_1|^2 + |T_2|^2. \tag{8.168}$$

However, we also know that $k^\mu T_\mu = 0$ from the Ward identity (8.166). This tells us that

$$k T_0 - k T_3 = 0 \tag{8.169}$$

and hence $T_0 = T_3$. It follows that we may write (8.168) as

$$\sum_{\lambda = 1, 2} \epsilon^\mu(k, \lambda) \epsilon^{\nu *}(k, \lambda) T_\mu T^*_\nu = |T_1|^2 + |T_2|^2 + |T_3|^2 - |T_0|^2 \tag{8.170}$$

$$= -g^{\mu\nu} T_\mu T^*_\nu. \tag{8.171}$$

Thus we may replace the non-covariant expression ‘$\sum_{\lambda = 1, 2} \epsilon^\mu(k, \lambda) \epsilon^{\nu *}(k, \lambda)$’ by the covariant one ‘$-g^{\mu\nu}$’. The reader may here recall equation (7.118), where the ‘pseudo-completeness’ relation involving all four $\epsilon$’s was given, a similarly covariant expression. This relation corresponds exactly to the right-hand side of (8.170), which (in these terms) shows that the $\lambda = 0$ state enters with negative norm.
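The replacement of the transverse polarization sum by $-g^{\mu\nu}$ can be checked numerically. The following is a minimal sketch, assuming a photon travelling along the z-axis and an amplitude vector whose components are arbitrary illustrative numbers (they do not come from any real process); the helper names are hypothetical.

```python
def transverse_polarization_sum(T):
    """|T_1|^2 + |T_2|^2: sum over the two transverse states for k along z."""
    return abs(T[1]) ** 2 + abs(T[2]) ** 2


def covariant_sum(T):
    """-g^{mu nu} T_mu T_nu^* with metric signature (+,-,-,-)."""
    return abs(T[1]) ** 2 + abs(T[2]) ** 2 + abs(T[3]) ** 2 - abs(T[0]) ** 2


def ward_constrained_amplitude(T1, T2, T3):
    """Return (T_0, T_1, T_2, T_3) with T_0 = T_3, as implied by (8.169)."""
    return (T3, T1, T2, T3)


# Arbitrary illustrative complex components (an assumption for this check).
T = ward_constrained_amplitude(0.3 + 1.2j, -0.7j, 2.0 - 0.5j)
print(transverse_polarization_sum(T), covariant_sum(T))
```

The two printed numbers agree only because $T_0 = T_3$; for an amplitude that violates the Ward identity the covariant expression would differ from the physical transverse sum.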

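Using this result, the term (8.167) becomes

$$\frac{e^4}{4(s - m^2)^2} \sum_{s, s'} \bar{u}' \gamma^\nu (\not p + \not k + m) \gamma^\mu u \, \bar{u} \gamma_\mu (\not p + \not k + m) \gamma_\nu u'$$

$$= \frac{e^4}{4(s - m^2)^2} \mathrm{Tr}\left[\gamma_\nu (\not p' + m) \gamma^\nu (\not p + \not k + m) \gamma^\mu (\not p + m) \gamma_\mu (\not p + \not k + m)\right] \tag{8.172}$$

where, in the second step, we have moved the $\gamma_\nu$ to the front of the trace, using (8.71). Expression (8.172) involves the trace of eight $\gamma$ matrices, which is beyond the power of the machinery given so far. However, it simplifies greatly if we neglect the electron mass, that is, if we are interested in the high-energy limit, as we shall be in parton model applications. In that case, (8.172) becomes

$$\frac{e^4}{4s^2} \mathrm{Tr}\left[\gamma_\nu \not p' \gamma^\nu (\not p + \not k) \gamma^\mu \not p \gamma_\mu (\not p + \not k)\right] \tag{8.173}$$

which we can simplify using the result (J.3) to

$$\frac{e^4}{s^2} \mathrm{Tr}\left[\not p' (\not p + \not k) \not p (\not p + \not k)\right] \tag{8.174}$$

$$= \frac{e^4}{s^2} \mathrm{Tr}\left[\not p' \not k \not p \not k\right] \quad \text{using } p^2 = p'^2 = 0 \tag{8.175}$$

$$= \frac{e^4}{s^2} \, 8 (p' \cdot k)(p \cdot k) \quad \text{using (8.76) and } k^2 = 0 \tag{8.176}$$

$$= -2e^4 u/s \tag{8.177}$$

where $u = (p - k')^2$. Problem 8.15 finishes the calculation, with the result that the spin-averaged squared amplitude is

$$\frac{1}{4} \sum_{s, s', \lambda, \lambda'} |\mathcal{M}_{\gamma e^-}|^2 = -2e^4 \left(\frac{u}{s} + \frac{s}{u}\right). \tag{8.178}$$

The cross section in the CMS is then (cf (6.129))

$$\frac{d\sigma}{d(\cos\theta)} = \frac{2\pi}{64\pi^2 s} \left[-2e^4 \left(\frac{u}{s} + \frac{s}{u}\right)\right] = -\frac{\pi\alpha^2}{s} \left(\frac{u}{s} + \frac{s}{u}\right). \tag{8.179}$$

FIGURE 8.16
$e^-\mu^-$ scattering amplitude.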
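The step from the trace (8.176) to the Mandelstam form (8.177) can be verified with explicit massless CMS momenta. The sketch below assumes that $\theta$ is the angle between the incoming and outgoing photon (an assumption made here for illustration), uses natural units, and relies on hypothetical helper names.

```python
import math


def mdot(a, b):
    """Minkowski product with metric signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def cms_momenta(E, cos_theta):
    """Massless gamma + e -> gamma + e in the CMS; every particle has energy E."""
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    k = (E, 0.0, 0.0, E)                                # incoming photon
    p = (E, 0.0, 0.0, -E)                               # incoming electron
    k_out = (E, E * sin_theta, 0.0, E * cos_theta)      # outgoing photon
    p_out = (E, -E * sin_theta, 0.0, -E * cos_theta)    # outgoing electron
    return k, p, k_out, p_out


def mandelstam_s_u(k, p, k_out):
    """s = (p + k)^2 and u = (p - k')^2."""
    total = tuple(a + b for a, b in zip(p, k))
    crossed = tuple(a - b for a, b in zip(p, k_out))
    return mdot(total, total), mdot(crossed, crossed)


def trace_pkpk(p_out, p, k):
    """Tr[p'-slash k-slash p-slash k-slash] for k^2 = 0, from the trace theorem."""
    return 8.0 * mdot(p_out, k) * mdot(p, k)


def dsigma_dcos(s, u, alpha):
    """Equation (8.179): -(pi alpha^2 / s)(u/s + s/u), in natural units."""
    return -(math.pi * alpha ** 2 / s) * (u / s + s / u)


k, p, k_out, p_out = cms_momenta(E=5.0, cos_theta=0.3)
s, u = mandelstam_s_u(k, p, k_out)
print(trace_pkpk(p_out, p, k), -2.0 * u * s)
```

Since $u$ is negative in the physical region, the overall minus sign in (8.179) gives a positive cross section, which grows as $u \to 0$.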
For parton model calculations, what is actually required is the analogous quantity calculated for the case in which the initial photon is virtual (see section 9.2). However, the discussion of section 7.3.2 shows that we may still use the polarization sum (8.170). A difference will arise in passing from (8.175) to (8.176) where we must remember that $k^2 \neq 0$. Since $k^2$ will be space-like, we put $k^2 = -Q^2$ and find (problem 8.16) that the spin-averaged squared amplitude for the virtual Compton process

$$\gamma(k^2 = -Q^2) + e^- \to \gamma + e^- \tag{8.180}$$

is given by

$$-2e^4 \left(\frac{u}{s} + \frac{s}{u} - \frac{2Q^2 t}{su}\right). \tag{8.181}$$




8.7 Electron muon elastic scattering 


Our final examples of electrodynamic processes are ones in which two fermions interact electromagnetically. In this section we discuss the scattering of two point-like fermions (i.e. leptons); in the following one we look at the changes (analogous to those for the $\pi^+$ as compared to the $s^+$) necessitated when one fermion is a hadron, for example the proton.

We shall consider $e^-\mu^-$ elastic scattering: our notation is indicated in figure 8.16. In the lowest order of perturbation theory (the one-photon exchange approximation) we can draw the relevant Feynman graph for this process. This is shown in figure 8.17. All the elements for the graph have been met before and so we can immediately write down the invariant amplitude which now depends on four spin labels:

$$\mathcal{M}_{e^-\mu^-}(r, s; r', s') = e\bar{u}(k', s') \gamma_\mu u(k, s) \, (g^{\mu\nu}/q^2) \, e\bar{u}(p', r') \gamma_\nu u(p, r). \tag{8.182}$$

FIGURE 8.17
One-photon exchange amplitude in $e^-\mu^-$ scattering.


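Although experiments with polarized leptons are not uncommon, we shall only be concerned with the unpolarized cross section

$$d\bar{\sigma} \sim \frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}(r, s; r', s')|^2. \tag{8.183}$$

We perform the same manipulations as in our $e^-s^+$ example and the cross section reduces to a factorized form involving two traces:

$$\frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}(r, s; r', s')|^2 = \left(\frac{e^2}{q^2}\right)^2 \left\{\tfrac{1}{2} \mathrm{Tr}[(\not k' + m) \gamma_\mu (\not k + m) \gamma_\nu]\right\} \times \left\{\tfrac{1}{2} \mathrm{Tr}[(\not p' + M) \gamma^\mu (\not p + M) \gamma^\nu]\right\} \tag{8.184}$$

$$= \left(\frac{e^2}{q^2}\right)^2 L_{\mu\nu} M^{\mu\nu} \tag{8.185}$$

where $L_{\mu\nu}$ is the ‘electron tensor’ calculated before (see (8.119)):

$$L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2) g_{\mu\nu}] \tag{8.186}$$

but now $M^{\mu\nu}$ is the appropriate tensor for the muon coupling, with the same structure as $L_{\mu\nu}$:

$$M^{\mu\nu} = 2[p'^\mu p^\nu + p'^\nu p^\mu + (q^2/2) g^{\mu\nu}]. \tag{8.187}$$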




To evaluate the cross section we must perform the ‘contraction’ $L_{\mu\nu} M^{\mu\nu}$. A useful trick to simplify this calculation is to use current conservation for the electron tensor $L_{\mu\nu}$. For the electron transition current, the electromagnetic current conservation condition is (cf equation (8.100))

$$q^\mu [\bar{u}(k', s') \gamma_\mu u(k, s)] = 0 \tag{8.188}$$

i.e. independent of the particular spin projections $s$ and $s'$. Since $L_{\mu\nu}$ is the product of two such currents, summed and averaged over polarizations, current conservation implies the conditions

$$q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0 \tag{8.189}$$

which can be explicitly checked using our result for $L_{\mu\nu}$. The usefulness of this result is that in the contraction $L_{\mu\nu} M^{\mu\nu}$ we can replace $p'$ in $M^{\mu\nu}$ by $(p + q)$ and then drop all the terms involving $q$’s, i.e.

$$L_{\mu\nu} M^{\mu\nu} = L_{\mu\nu} M^{\mu\nu}_{\rm eff} \tag{8.190}$$

where

$$M^{\mu\nu}_{\rm eff} = 2[2p^\mu p^\nu + (q^2/2) g^{\mu\nu}]. \tag{8.191}$$
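The current-conservation trick of (8.189) to (8.191) lends itself to a direct numerical check. The sketch below builds the electron and muon tensors from explicit 4-momenta and compares the full contraction with the one using the effective muon tensor; the momentum values are arbitrary illustrative choices in GeV and the helper names are hypothetical.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g, signature (+,-,-,-)


def on_shell(mass, px, py, pz):
    """Contravariant 4-momentum (E, px, py, pz) with E fixed by the mass shell."""
    return (math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2), px, py, pz)


def mdot(a, b):
    return sum(METRIC[i] * a[i] * b[i] for i in range(4))


def lower(a):
    return tuple(METRIC[i] * a[i] for i in range(4))


def electron_tensor(k_out, k):
    """L_{mu nu} of (8.186), lower indices, with q = k - k'."""
    q = tuple(a - b for a, b in zip(k, k_out))
    q2 = mdot(q, q)
    kl, kol = lower(k), lower(k_out)
    return [[2.0 * (kol[m] * kl[n] + kol[n] * kl[m]
                    + 0.5 * q2 * (METRIC[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def muon_tensor(p_out, p, q2):
    """M^{mu nu} of (8.187), upper indices."""
    return [[2.0 * (p_out[m] * p[n] + p_out[n] * p[m]
                    + 0.5 * q2 * (METRIC[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def muon_tensor_eff(p, q2):
    """Effective tensor of (8.191), with all q-dependent vector terms dropped."""
    return [[2.0 * (2.0 * p[m] * p[n] + 0.5 * q2 * (METRIC[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def contract(L, M):
    return sum(L[m][n] * M[m][n] for m in range(4) for n in range(4))


# Illustrative momenta in GeV (arbitrary values chosen only for this check).
k = on_shell(0.000511, 0.3, 0.1, 2.0)
k_out = on_shell(0.000511, -0.2, 0.4, 1.1)
p = on_shell(0.10566, 0.05, -0.1, 0.2)
q = tuple(a - b for a, b in zip(k, k_out))
p_out = tuple(a + b for a, b in zip(p, q))
q2 = mdot(q, q)
L = electron_tensor(k_out, k)
full = contract(L, muon_tensor(p_out, p, q2))
eff = contract(L, muon_tensor_eff(p, q2))
print(full, eff)
```

The agreement depends only on $q^\mu L_{\mu\nu} = 0$, which in turn requires the incoming and outgoing electron to have the same mass.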


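The calculation of the cross section is now straightforward. In the ‘laboratory’ system, defined (unrealistically) by the target muon at rest

$$p^\mu = (M, 0, 0, 0) \tag{8.192}$$

with $M$ now the muon mass, the result is (problem 8.17(a))

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} \left[1 - \frac{q^2}{2M^2} \tan^2(\theta/2)\right]. \tag{8.193}$$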


Note the following points: 


Comment (a) 


The ‘no-structure’ cross section (8.122) for $e^-s^+$ scattering now appears modified by an additional term proportional to $\tan^2(\theta/2)$. This is due to the spin-$\frac{1}{2}$ nature of the muon which gives rise to scattering from both the charge and the magnetic moment of the muon.


Comment (b) 


In the kinematics the electron mass has been neglected, which is usually a good approximation at high energies. We should add a word of explanation for the ‘laboratory’ cross sections we have calculated, with the target muon unrealistically at rest. The form of the cross section, $(d\sigma/d\Omega)_{\rm ns}$, and of the cross section for the scattering of two Dirac point particles, will be of great value in our discussion of the quark parton model in the next chapter.
Comment (c)


The crossed version of this process, namely $e^+e^- \to \mu^+\mu^-$, is a very important monitoring reaction for electron–positron colliding beam machines. It is also basic to a discussion of the predictions of the quark parton model for $e^+e^- \to$ hadrons, which will be discussed in section 9.5. An instructive calculation similar to this one leads to the result (see problem 8.18)

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4q^2} (1 + \cos^2\theta) \tag{8.194}$$

where all variables are defined in the $e^+e^-$ CM frame, $q^2$ is now the square of the CM energy, and the electron and muon masses have been neglected. The total cross section, in the one-photon exchange approximation, is then

$$\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2), \tag{8.195}$$

where we have made use of equation (B.18) of appendix B.
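The numerical coefficient in (8.195) follows from converting natural units to nanobarns. A minimal sketch, assuming the standard values $\alpha \approx 1/137.036$ and $(\hbar c)^2 \approx 0.38938\ \mathrm{GeV^2\,mb}$ (these constants are supplied here as assumptions, in place of the book's equation (B.18)):

```python
import math

ALPHA = 1.0 / 137.035999           # fine-structure constant (assumed value)
HBARC2_NB_GEV2 = 0.3893794e6       # (hbar c)^2 in nb GeV^2 (assumed value)


def sigma_mumu_nb(q2_gev2):
    """One-photon-exchange total cross section 4 pi alpha^2 / (3 q^2), in nb."""
    return 4.0 * math.pi * ALPHA ** 2 / (3.0 * q2_gev2) * HBARC2_NB_GEV2


print(sigma_mumu_nb(1.0))      # coefficient of 1/q^2 in nb
print(sigma_mumu_nb(100.0))    # value at a CM energy of 10 GeV
```

The first printed value reproduces the coefficient quoted in (8.195) to the accuracy given there; the second illustrates the $1/q^2$ fall-off.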

The energy dependence of this cross section ($\propto 1/q^2$) is important, and can be understood by a simple dimensional argument. A cross section has dimensions of a squared length, or in natural units (appendix B) inverse squared mass or energy. Here both colliding particles are taken to be pointlike, with no form factors involving a length parameter, and the mediating quantum is massless. At energies much larger than the lepton masses, the only available dimensional quantity is the CM energy. It follows that the cross section must be inversely proportional to the square of the CM energy, in this ‘pointlike, high energy’ limit. By the same token, deviations from this behaviour would be evidence for non-pointlike leptonic structure.




8.8 Electron—proton elastic scattering and nucleon form 
factors 


In the one-photon exchange approximation, the Feynman diagram for elastic electron–proton scattering may be drawn as in figure 8.18, where the ‘blob’ at the $pp\gamma$ vertex signifies the expected modification of the point coupling due to strong interactions. The structure of the proton vertex can be analysed using symmetry principles in the same way as for the pion vertex. The presence of Dirac spinors and $\gamma$-matrices makes this a somewhat involved procedure: problem 8.20 is an example of the type of complication that arises. Full details of such an analysis can be found in Bernstein (1968), for example. Here, however, we shall proceed in a different way, in order to generalize more easily to inelastic scattering in the following chapter. We focus directly on the ‘proton tensor’ $B^{\mu\nu}$, which is the product of two proton current matrix elements, summed and averaged over polarizations, as is required in the calculation of the unpolarized cross section (cf (8.57)):

$$B^{\mu\nu} = \frac{1}{2} \sum_{s, s'} \langle \mathrm{p}; p', s' | \hat{j}^\mu_{\rm em, p}(0) | \mathrm{p}; p, s \rangle \left(\langle \mathrm{p}; p', s' | \hat{j}^\nu_{\rm em, p}(0) | \mathrm{p}; p, s \rangle\right)^*. \tag{8.196}$$

FIGURE 8.18
One-photon exchange amplitude in $e^-p$ scattering, including hadronic corrections at the $pp\gamma$ vertex.

We remarked in comment (a) after equation (8.193) that for $e^-$ scattering from a point-like charged fermion an additional term in the cross section was present, corresponding to scattering from the target’s magnetic moment. Since a real proton is not a point particle, the virtual strong interaction effects will modify both the charge and the magnetic moment distribution. Hence we may expect that two form factors will be needed to describe the deviation from point-like behaviour. This is in fact the case, as we now show using symmetry arguments similar to those of section 8.4.


8.8.1 Lorentz invariance 


$B^{\mu\nu}$ must retain its tensor character: this must be made up using the available 4-vectors and tensors at our disposal. For the spin-averaged case we have only

$$p, \quad q \quad \text{and} \quad g_{\mu\nu} \tag{8.197}$$

since $p' = p + q$. The antisymmetric tensor $\epsilon_{\mu\nu\alpha\beta}$ (see appendix J) must actually be ruled out using parity invariance: the tensor $B^{\mu\nu}$ is not a pseudo tensor since $\hat{j}^\mu_{\rm em}$ is a vector. It is helpful to remember that $\epsilon_{\mu\nu\alpha\beta}$ is the generalization of $\epsilon_{ijk}$ in three dimensions, and that the vector product of two 3-vectors (a pseudo vector) may be written

$$(\boldsymbol{a} \times \boldsymbol{b})_i = \epsilon_{ijk} a_j b_k. \tag{8.198}$$
8.8.2 Current conservation


For a real proton, current conservation gives the condition (cf (8.148))

$$q_\mu \langle \mathrm{p}; p', s' | \hat{j}^\mu_{\rm em, p}(0) | \mathrm{p}; p, s \rangle = 0 \tag{8.199}$$

which translates to the conditions (cf (8.189))

$$q_\mu B^{\mu\nu} = q_\nu B^{\mu\nu} = 0 \tag{8.200}$$

on the tensor $B^{\mu\nu}$.

There are only two possible tensors we can make that satisfy both these requirements. One involves $p$ and is constructed to be orthogonal to $q$. We introduce a vector

$$\tilde{p}_\mu = p_\mu + \alpha q_\mu \tag{8.201}$$

and require

$$q \cdot \tilde{p} = 0. \tag{8.202}$$

Hence we find

$$\tilde{p}_\mu = p_\mu - (p \cdot q/q^2) q_\mu \tag{8.203}$$

and thus the tensor

$$\tilde{p}^\mu \tilde{p}^\nu = [p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu] \tag{8.204}$$

satisfies all our requirements. The second tensor must involve $g^{\mu\nu}$ and may be chosen to be

$$-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2} \tag{8.205}$$

which again satisfies our conditions. Thus from invariance arguments alone, the tensor $B^{\mu\nu}$ for the proton vertex may be parametrized by these two tensors, each multiplied by an unknown function of $q^2$. If we define

$$B^{\mu\nu} = 4A(q^2)[p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu] + 2M^2 B(q^2)(-g^{\mu\nu} + q^\mu q^\nu/q^2) \tag{8.206}$$

the cross section in the laboratory frame is (problem 8.19)

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} [A + B \tan^2(\theta/2)]. \tag{8.207}$$

Formula (8.207) implies that a plot of $(d\sigma/d\Omega)/(d\sigma/d\Omega)_{\rm ns}$ versus $\tan^2(\theta/2)$, at fixed $q^2$, will be a straight line with slope $B$ and intercept $A$.

The functions $A$ and $B$ may be related to the ‘charge’ and ‘magnetic’ form factors of the proton. The Dirac ‘charge’ and Pauli ‘anomalous magnetic moment’ form factors, $F_1$ and $F_2$ respectively, are defined by

$$\langle \mathrm{p}; p', s' | \hat{j}^\mu_{\rm em, p}(0) | \mathrm{p}; p, s \rangle = (+e) \bar{u}(p', s') \left[\gamma^\mu F_1(q^2) + \frac{i\kappa F_2(q^2)}{2M} \sigma^{\mu\nu} q_\nu\right] u(p, s) \tag{8.208}$$

with the normalization

$$F_1(0) = 1 \tag{8.209}$$
$$F_2(0) = 1 \tag{8.210}$$

and the magnetic moment of the proton is not one (nuclear) magneton, as for an electron or muon (neglecting higher-order corrections), but rather $\mu_p = 1 + \kappa$ with $\kappa = 1.79$. Problem 8.20 shows that the $\bar{u}\gamma^\mu u$ piece in (8.208) can be rewritten in terms of $\bar{u}(p + p')^\mu u/2M$ and $\bar{u} i\sigma^{\mu\nu} q_\nu u/2M$. The first of these is analogous to the interaction of a charged spin-0 particle. As regards the second, we note that $\sigma^{\mu\nu}$ is just

$$\sigma^{\mu\nu} = \tfrac{1}{2} i [\gamma^\mu, \gamma^\nu] \tag{8.211}$$

which reduces to the Pauli spin matrices for the space-like components

$$\sigma^{ij} = \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} \tag{8.212}$$

with our representation of $\gamma$-matrices ($\sigma^{ij}$ is a $4 \times 4$ matrix, $\sigma^k$ is $2 \times 2$, and $i$, $j$ and $k$ are in cyclic order). The second term in this ‘Gordon decomposition’ of $\bar{u}\gamma^\mu u$ thus corresponds to an interaction via the spin magnetic moment, with, in fact, $g = 2$. Thus the addition of the $\kappa$ term in (8.208) corresponds to an ‘anomalous’ magnetic moment piece. In terms of $F_1$ and $F_2$ one can show that

$$A = F_1^2 + \tau\kappa^2 F_2^2 \tag{8.213}$$
$$B = 2\tau (F_1 + \kappa F_2)^2 \tag{8.214}$$

where

$$\tau = -q^2/4M^2. \tag{8.215}$$

The point-like cross section (8.193) is recovered from (8.207) by setting $F_1 = 1$ and $\kappa = 0$ in (8.213) and (8.214).

The functions $F_1$ and $F_2$ are, in turn, usually expressed in terms of the electric and magnetic form factors $G_E$ and $G_M$, defined by $G_E = F_1 - \tau\kappa F_2$, $G_M = F_1 + \kappa F_2$. We then find $A = (G_E^2 + \tau G_M^2)/(1 + \tau)$ and $B = 2\tau G_M^2$. The cross section formula (8.207), written in terms of $G_E$ and $G_M$, is known as the ‘Rosenbluth’ cross section.
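The equivalence of the two parametrizations of $A$ and $B$ is a short algebraic exercise, and it can also be confirmed numerically. The sketch below is illustrative only: the form-factor values are arbitrary numbers chosen for the check, not measured data, and the function names are hypothetical.

```python
def tau(q2, M):
    """tau = -q^2 / 4M^2, equation (8.215); q2 is negative for scattering."""
    return -q2 / (4.0 * M * M)


def ab_from_dirac_pauli(F1, F2, kappa, t):
    """A and B from (8.213) and (8.214)."""
    A = F1 ** 2 + t * kappa ** 2 * F2 ** 2
    B = 2.0 * t * (F1 + kappa * F2) ** 2
    return A, B


def sachs_form_factors(F1, F2, kappa, t):
    """G_E = F1 - tau kappa F2 and G_M = F1 + kappa F2."""
    return F1 - t * kappa * F2, F1 + kappa * F2


def ab_from_sachs(GE, GM, t):
    """A = (G_E^2 + tau G_M^2)/(1 + tau) and B = 2 tau G_M^2."""
    return (GE ** 2 + t * GM ** 2) / (1.0 + t), 2.0 * t * GM ** 2


# Arbitrary illustrative inputs (GeV units for q^2 and M).
t = tau(q2=-1.0, M=0.938)
A1, B1 = ab_from_dirac_pauli(0.6, 0.5, 1.79, t)
A2, B2 = ab_from_sachs(*sachs_form_factors(0.6, 0.5, 1.79, t), t)
print(A1, A2, B1, B2)
```

In the Rosenbluth form the intercept and slope of the straight-line plot separate $G_E$ and $G_M$ without cross terms, which is the practical reason for preferring $G_E$, $G_M$ over $F_1$, $F_2$.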

Experimental data indicate that the $q^2$-dependences of $G_E$ and $G_M$ for the proton, and of $G_M$ for the neutron, are all quite well represented by the function $F(q^2)$ of (8.136) with $q^2$ replaced by $-q^2$ and with $a \approx 0.84\ \mathrm{GeV}^{-1}$, at least for values of $-q^2$ up to a few GeV$^2$ (see, for example, Perkins 1987, section 6.5).

Before we leave elastic scattering it is helpful to look in some more detail at the kinematics. It will be sufficient to consider the ‘point-like’ case, which we shall call $e^-\mu^+$, for definiteness. Energy and momentum conservation at the $\mu^+$ vertex gives the condition

$$p + q = p' \tag{8.216}$$

with the mass-shell conditions ($M$ is the $\mu^+$ mass)

$$p^2 = p'^2 = M^2. \tag{8.217}$$

Hence for elastic scattering we have the relation

$$2p \cdot q = -q^2. \tag{8.218}$$

It is conventional to relate these invariants to the corresponding laboratory frame ($p^\mu = (M, \boldsymbol{0})$) expressions. Neglecting the electron mass so that$^2$

$$k \equiv |\boldsymbol{k}| = \omega \tag{8.219}$$
$$k' \equiv |\boldsymbol{k}'| = \omega' \tag{8.220}$$

we have

$$q^2 = -2kk'(1 - \cos\theta) = -4kk' \sin^2(\theta/2) \tag{8.221}$$

and

$$p \cdot q = M(k - k') = M\nu \tag{8.222}$$

where $\nu$ is the energy transfer $q^0$ in this frame. To avoid unnecessary minus signs, it is convenient to define

$$Q^2 = -q^2 = 4kk' \sin^2(\theta/2) \tag{8.223}$$

and the elastic scattering relation between $p \cdot q$ and $q^2$ reads

$$\nu = Q^2/2M \tag{8.224}$$

or

$$\frac{k'}{k} = \frac{1}{1 + (2k/M) \sin^2(\theta/2)}. \tag{8.225}$$

Remembering, therefore, that for elastic scattering $k'$ and $\theta$ are not independent variables, we can perform a change of variables (see appendix K) in the laboratory frame

$$d\Omega = 2\pi \, d(\cos\theta) = (\pi/k'^2) \, dQ^2 \tag{8.226}$$

and write the differential cross section for $e^-\mu^+$ scattering as

$$\frac{d\sigma}{dQ^2} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} [\cos^2(\theta/2) + 2\tau \sin^2(\theta/2)]. \tag{8.227}$$

$^2$ As after equation (8.126), note again that in the present context ‘$k$’ and ‘$k'$’ are not 4-vectors but the moduli of 3-vectors.
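The elastic relations (8.223) to (8.226) are easy to exercise numerically. The following sketch assumes a massless electron and a target of mass $M$ at rest, with illustrative values in GeV; the function names are hypothetical.

```python
import math


def scattered_energy(k, theta, M):
    """k' from (8.225) for elastic scattering off a target of mass M at rest."""
    return k / (1.0 + (2.0 * k / M) * math.sin(theta / 2.0) ** 2)


def momentum_transfer_squared(k, k_out, theta):
    """Q^2 = -q^2 = 4 k k' sin^2(theta/2), equation (8.223)."""
    return 4.0 * k * k_out * math.sin(theta / 2.0) ** 2


def elastic_q2_of_cos(k, cos_theta, M):
    """Q^2 as a function of cos(theta) along the elastic line."""
    theta = math.acos(cos_theta)
    return momentum_transfer_squared(k, scattered_energy(k, theta, M), theta)


# Illustrative values: a 10 GeV electron scattered through 20 degrees by a muon.
k, theta, M = 10.0, math.radians(20.0), 0.10566
k_out = scattered_energy(k, theta, M)
Q2 = momentum_transfer_squared(k, k_out, theta)
nu = k - k_out
print(nu, Q2 / (2.0 * M))

# Central-difference estimate of dQ^2/d(cos theta), to compare with -2 k'^2.
h = 1e-6
c = math.cos(theta)
dQ2_dcos = (elastic_q2_of_cos(k, c + h, M) - elastic_q2_of_cos(k, c - h, M)) / (2.0 * h)
print(dQ2_dcos, -2.0 * k_out ** 2)
```

The second comparison is the content of (8.226): since $|dQ^2| = 2k'^2 \, |d(\cos\theta)|$, one has $2\pi \, d(\cos\theta) = (\pi/k'^2) \, dQ^2$.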
FIGURE 8.19

Physical regions for $e^-p$ scattering in the $Q^2$, $\nu$ variables: A, kinematically forbidden region; B, line of elastic scattering ($Q^2 = 2M\nu$); C, lines of resonance electroproduction; D, photoproduction; E, deep inelastic region ($Q^2$ and $\nu$ large).


For elastic scattering $\nu$ is not independent of $Q^2$ but we may formally write this as a double-differential cross section by inserting the $\delta$-function to ensure this condition is satisfied:

$$\frac{d^2\sigma}{dQ^2 \, d\nu} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} \left[\cos^2(\theta/2) + 2\tau \sin^2(\theta/2)\right] \delta\left(\nu - \frac{Q^2}{2M}\right), \qquad \tau = \frac{Q^2}{4M^2}. \tag{8.228}$$


This is the cross section for the scattering of an electron from a point-like fermion target of charge $e$ and mass $M$.

It is illuminating to plot out the physically allowed regions of $Q^2$ and $\nu$ (figure 8.19). Elastic $e^-p$ scattering corresponds to the line $Q^2 = 2M\nu$. Resonance production $e^-p \to e^-N^*$ with $p'^2 = M'^2$ corresponds to lines parallel to the elastic line, shifted to the right by $M'^2 - M^2$ since

$$2M\nu = Q^2 + M'^2 - M^2. \tag{8.229}$$

Experiments with real photons, $Q^2 = 0$, correspond to exploring along the $\nu$-axis. In the next chapter we switch our attention to so-called deep inelastic electron scattering, the region of large $Q^2$ and large $\nu$.
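Problems


8.1 Consider a matrix element of the form

$$M = \int d^3x \int dt \; e^{+ip_f \cdot x} \, \partial_\mu A^\mu \, e^{-ip_i \cdot x}.$$

Assuming the integration is over all space-time and that

$$A^0 \to 0 \quad \text{as } t \to \pm\infty$$

and

$$|\boldsymbol{A}| \to 0 \quad \text{as } |\boldsymbol{x}| \to \infty$$

use integration by parts to show

(a) $\int dt \; e^{+ip_f \cdot x} \, \partial_0 A^0 \, e^{-ip_i \cdot x} = (-ip_{f0}) \int dt \; e^{+ip_f \cdot x} A^0 e^{-ip_i \cdot x}$

(b) $\int d^3x \; e^{+ip_f \cdot x} \, \boldsymbol{\nabla} \cdot \boldsymbol{A} \, e^{-ip_i \cdot x} = +i\boldsymbol{p}_f \cdot \left(\int d^3x \; e^{+ip_f \cdot x} \boldsymbol{A} \, e^{-ip_i \cdot x}\right)$.

Hence show that

$$\int d^3x \int dt \; e^{+ip_f \cdot x} (\partial_\mu A^\mu + A^\mu \partial_\mu) e^{-ip_i \cdot x} = -i(p_f + p_i)_\mu \int d^3x \int dt \; e^{+ip_f \cdot x} A^\mu e^{-ip_i \cdot x}.$$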


8.2 Verify equation (8.27). 


8.3 Evaluate (8.31) and interpret the result physically (i.e. compare it with 
(8.27)). 


8.4 


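(a) Using the $u$-spinors normalized as in (3.73), the $\phi^{1,2}$ of (8.47), and the result for $\boldsymbol{\sigma} \cdot \boldsymbol{A} \, \boldsymbol{\sigma} \cdot \boldsymbol{B}$ from problem 3.4(b), show that

$$u^\dagger(k', s' = 1) u(k, s = 1) = (E + m) \left\{1 + \frac{\boldsymbol{k}' \cdot \boldsymbol{k}}{(E + m)^2} + \frac{i\phi^{1\dagger} \boldsymbol{\sigma} \cdot (\boldsymbol{k}' \times \boldsymbol{k}) \phi^1}{(E + m)^2}\right\}.$$

(b) For any vector $\boldsymbol{A} = (A^1, A^2, A^3)$, show that $\phi^{1\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^1 = A^3$. Find similar expressions for $\phi^{1\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^2$, $\phi^{2\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^1$, $\phi^{2\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^2$.

(c) Show that the $S$ of (8.46) is equal to

$$S = (E + m)^2 \left\{\left[1 + \frac{\boldsymbol{k}' \cdot \boldsymbol{k}}{(E + m)^2}\right]^2 + \frac{(\boldsymbol{k}' \times \boldsymbol{k})^2}{(E + m)^4}\right\}.$$

(d) Using $\cos\theta = \boldsymbol{k} \cdot \boldsymbol{k}'/(|\boldsymbol{k}||\boldsymbol{k}'|)$, $|\boldsymbol{k}| = |\boldsymbol{k}'|$ and $v = |\boldsymbol{k}|/E$, show that

$$S = (2E)^2 (1 - v^2 \sin^2 \theta/2).$$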


8.5 Verify equation (8.55). 

8.6 Check that Pytty = q”. 

8.7 Verify equation (8.79) for the lepton tensor $L^{\mu\nu}$.

8.8 Evaluate $L^{\mu\nu}$ as in equation (8.80).

8.9 Verify equation (8.87).

8.10 Verify equation (8.96) for the $e^-s^+ \to e^-s^+$ amplitude to $O(e^2)$.


8.11 Check that both the scalar and the spinor current matrix elements (8.27) and (8.55), satisfy $\partial_\mu j^\mu(x) = 0$.


8.12 Verify equation (8.120). 


8.13 Verify equation (8.136) for the Fourier transform of $\rho(\boldsymbol{x})$ given by (8.135). Show that the mean square radius of the distribution (8.135) is $12a^2$.


8.14 Check the gauge invariance of $\mathcal{M}_{\gamma e^-}$ given by (8.162), by showing that if $\epsilon_\mu$ is replaced by $k_\mu$, or $\epsilon'^*_\nu$ by $k'_\nu$, the result is zero.


8.15


(a) The spin-averaged squared amplitude for lowest-order electron Compton scattering contains the interference term

$$\sum_{\lambda, \lambda', s, s'} \mathcal{M}^{(s)}_{\gamma e^-} \mathcal{M}^{(u)*}_{\gamma e^-}$$

where (s) and (u) refer to the s- and u-channel processes of figure 8.14(a) and (b) respectively. Obtain an expression analogous to (8.172) for this term, and prove that it is, in fact, zero. [Hint: work in the massless limit, and use relations (J.4) and (J.5).]


(b) Explain why the term

$$\sum_{\lambda, \lambda', s, s'} \mathcal{M}^{(u)}_{\gamma e^-} \mathcal{M}^{(u)*}_{\gamma e^-}$$

is given by (8.177) with $s$ and $u$ interchanged.


8.16 Recalculate the interference term of problem 8.15(a) for the case $k^2 = -Q^2$ (but with $k'^2 = p^2 = p'^2 = 0$), and hence verify (8.181).




8.17

(a) Derive an expression for the spin-averaged differential cross section for lowest-order $e^-\mu^-$ scattering in the laboratory frame, defined by $p^\mu = (M, \boldsymbol{0})$ where $M$ is now the muon mass, and show that it may be written in the form

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} \left[1 - (q^2/2M^2) \tan^2(\theta/2)\right]$$

where the ‘no-structure’ cross section is that of $e^-s^+$ scattering (appendix K) and the electron mass has been neglected.

(b) Neglecting all masses, evaluate the spin-averaged expression (8.184) in terms of $s$, $t$ and $u$ and use the result

$$\frac{d\sigma}{dt} = \frac{1}{16\pi s^2} \frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}(r, s; r', s')|^2$$

to show that the $e^-\mu^-$ cross section may be written in the form

$$\frac{d\sigma}{dt} = \frac{4\pi\alpha^2}{t^2} \frac{1}{2} \left(1 + \frac{u^2}{s^2}\right).$$

Show also that by introducing the variable $y$, defined in terms of laboratory variables by $y = (k - k')/k$, this reduces to the result

$$\frac{d\sigma}{dy} = \frac{4\pi\alpha^2}{t^2} \, s \, \frac{1}{2} \left[1 + (1 - y)^2\right].$$


8.18 Consider the process $e^+e^- \to \mu^+\mu^-$ in the CM frame.

(a) Draw the lowest-order Feynman diagram and write down the corresponding amplitude.

(b) Show that the spin-averaged squared matrix element has the form

$$|\overline{\mathcal{M}}|^2 = \frac{(4\pi\alpha)^2}{q^4} L(e)_{\mu\nu} L(\mu)^{\mu\nu}$$

where $q^2$ is the square of the total CM energy, and $L(e)$ depends on the $e^-$ and $e^+$ momenta and $L(\mu)$ on those of the $\mu^+$, $\mu^-$.

(c) Evaluate the traces and the tensor contraction (neglecting lepton masses): (i) directly, using the trace theorems; and (ii) by using crossing symmetry and the results of section 8.7 for $e^-\mu^-$ scattering. Hence show that

$$|\overline{\mathcal{M}}|^2 = (4\pi\alpha)^2 (1 + \cos^2\theta)$$

where $\theta$ is the CM scattering angle, and that the CM differential cross section is

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4q^2} (1 + \cos^2\theta).$$

Hence show that the total cross section is (see equation (B.18) of appendix B)

$$\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2).$$

Figure 8.20 shows data (a) for $\sigma$ in $e^+e^- \to \mu^+\mu^-$ and $e^+e^- \to \tau^+\tau^-$ and (b) for the angular distribution in $e^+e^- \to \mu^+\mu^-$. Note that $s = q^2$. The data in figure 8.20(a) agree well with the prediction above for $\sigma$. The broken curve in figure 8.20(b) shows the pure QED prediction of part (c) for $d\sigma/d\Omega$. It is clear that, while the distribution has the general $1 + \cos^2\theta$ form as predicted, there is a small but definite forward–backward asymmetry. This arises because, in addition to the $\gamma$-exchange amplitude there is also a $Z^0$-exchange amplitude (see section 22.3 of volume 2) which we have neglected. Such asymmetries are an important test of the electroweak theory. They are too small to be visible in the total cross sections in figure 8.20(a).

FIGURE 8.20
(a) Total cross sections for $e^+e^- \to \mu^+\mu^-$ and $e^+e^- \to \tau^+\tau^-$; (b) differential cross section for $e^+e^- \to \mu^+\mu^-$. (From D H Perkins 2000 Introduction to High Energy Physics 4th edn, courtesy Cambridge University Press.)


8.19 Verify equation (8.207). [Hint: as in equation (8.191) the terms in $q^\mu$ and $q^\nu$ in $B^{\mu\nu}$ may be neglected because of the conditions (8.189).]
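8.20 Starting from the expression

$$\bar{u}(p') \left(\frac{i\sigma^{\mu\nu} q_\nu}{2M}\right) u(p)$$

where $q = p' - p$ and $\sigma^{\mu\nu} = \frac{1}{2} i[\gamma^\mu, \gamma^\nu]$, use the Dirac equation and properties of $\gamma$-matrices to prove the ‘Gordon decomposition’ of the current

$$\bar{u}(p') \gamma^\mu u(p) = \bar{u}(p') \left(\frac{(p + p')^\mu}{2M} + \frac{i\sigma^{\mu\nu} q_\nu}{2M}\right) u(p).$$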
Deep Inelastic Electron–Nucleon Scattering
and the Parton Model 


We have obtained the rules for doing calculations of simple processes in quantum
electrodynamics for particles of spin-0 and spin-$\frac{1}{2}$, and many explicit
examples have been considered. In this chapter we build on these results to 
give an (admittedly brief) introduction to a topic of central importance in par- 
ticle physics, the structure of hadrons as revealed by deep inelastic scattering 
experiments (the equally important neutrino scattering experiments will be 
discussed in volume 2). We do this partly because the necessary calculations 
involve straightforward, illustrative and eminently practical applications of 
the rules already obtained, but, more particularly, because it is from a com- 
parison of these calculations with experiment that compelling evidence was 
obtained for the existence of the point-like constituents of hadrons — quarks 
and gluons — the interactions of which are described by QCD. 




9.1 Inelastic electron—proton scattering: kinematics and 
structure functions 


At large momentum transfers there is very little elastic scattering: inelastic 
scattering, in which there is more than just the electron and proton in the final 
state, is much more probable. The simplest inelastic cross section to measure 
is the so-called ‘inclusive’ cross section, for which only the final electron is 
observed. This is therefore a sum over the cross sections for all the possible 
hadronic final states: no attempt is made to select any particular state from 
the hadronic debris created at the proton vertex. This process may be repre- 
sented by the diagram of figure 9.1, assuming that the one-photon exchange 
amplitude dominates. The ‘blob’ at the proton vertex indicates our ignorance 
of the detailed structure: X indicates a sum over all possible hadronic final 
states. However, the assumption of one-photon exchange, which is known 
experimentally to be a very good approximation, means that, as in our pre- 
vious examples (cf (8.118) and (8.185)), the cross section must factorize into 
a leptonic tensor contracted with a tensor describing the hadron vertex: 


$$d\sigma \sim L_{\mu\nu} W^{\mu\nu}(q, p). \tag{9.1}$$
FIGURE 9.1
Inelastic electron–proton scattering, in one-photon exchange approximation; $X$ denotes the unobserved hadrons, with total momentum $p'$.


The lepton vertex is well described by QED and takes the same form as before:

$$L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2) g_{\mu\nu}]. \tag{9.2}$$


For the hadron tensor, however, we expect strong interactions to play an important role and we must deduce its general structure by our powerful invariance arguments. We will only consider unpolarized scattering and therefore perform an average over the initial proton spins. The sum over final states, $X$, includes all possible quantum numbers for each hadronic state with total momentum $p'$. For an inclusive cross section, the final phase space involves only the scattered electron. Moreover, since we are not restricting the scattering process by picking out any specific state of $X$, the energy $k'$ and the scattering angle $\theta$ of the final electron are now independent variables. In $W^{\mu\nu}(q, p)$ the sum over $X$ includes the phase space for each hadronic state restricted by the usual 4-momentum-conserving $\delta$-function to ensure that each state in $X$ has momentum $p'$. Including some conventional factors, we define $W^{\mu\nu}(q, p)$ by (see problem 9.1)

$$e^2 W^{\mu\nu}(q, p) = \frac{1}{4\pi M} \frac{1}{2} \sum_s \sum_X \langle \mathrm{p}; p, s | \hat{j}^\mu_{\rm em, p}(0) | X; p' \rangle \langle X; p' | \hat{j}^\nu_{\rm em, p}(0) | \mathrm{p}; p, s \rangle \times (2\pi)^4 \delta^4(p + q - p'). \tag{9.3}$$

How do we parametrize the tensor structure of $W^{\mu\nu}$? As usual, Lorentz invariance and current conservation come to our aid. There is one important difference compared with the elastic form factor case of section 8.8. For inclusive inelastic scattering there are now two independent scalar variables. The relation

$$p' = p + q \tag{9.4}$$

leads to

$$p'^2 = M^2 + 2p \cdot q + q^2 \tag{9.5}$$

where $M$ is the proton mass. In this case, the invariant mass of the hadronic final state is a variable

$$p'^2 = W^2 \tag{9.6}$$

and is related to the other two scalar variables

$$p \cdot q = M\nu \tag{9.7}$$

and (cf (8.223))

$$q^2 = -Q^2 \tag{9.8}$$

by the condition (cf (8.229))

$$2M\nu = Q^2 + W^2 - M^2. \tag{9.9}$$

Our invariance arguments lead us to the same tensor structure as for elastic electron–proton scattering, but now the functions $A(q^2)$, $B(q^2)$ are replaced by ‘structure functions’ which are functions of two variables, usually taken to be $\nu$ and $Q^2$. The conventional definition of the proton structure functions $W_1$ and $W_2$ is

$$W^{\mu\nu}(q, p) = (-g^{\mu\nu} + q^\mu q^\nu/q^2) W_1(Q^2, \nu) + [p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu] M^{-2} W_2(Q^2, \nu). \tag{9.10}$$

Inserting the usual flux factor together with the final electron phase space leads to the following expression for the inclusive differential cross section for inelastic electron–proton scattering (see problem 9.1):

$$d\sigma = \left(\frac{4\pi\alpha}{q^2}\right)^2 \frac{1}{4[(k \cdot p)^2 - m^2 M^2]^{1/2}} \, 4\pi M L_{\mu\nu} W^{\mu\nu} \frac{d^3 k'}{2\omega' (2\pi)^3}. \tag{9.11}$$

In terms of ‘laboratory’ variables, neglecting electron mass effects, this yields (problem 9.2(a))

$$\frac{d^2\sigma}{d\Omega \, dk'} = \frac{\alpha^2}{4k^2 \sin^4(\theta/2)} [W_2 \cos^2(\theta/2) + 2W_1 \sin^2(\theta/2)]. \tag{9.12}$$

Remembering now that $\cos\theta$ and $k'$ are independent variables for inelastic scattering, we can change variables from $\cos\theta$ and $k'$ to $Q^2$ and $\nu$, assuming azimuthal symmetry for the unpolarized cross section. We have

$$Q^2 = 2kk'(1 - \cos\theta) \tag{9.13}$$
$$\nu = k - k' \tag{9.14}$$

so that (problem 9.2(b))

$$d(\cos\theta) \, dk' = \frac{1}{2kk'} \, dQ^2 \, d\nu \tag{9.15}$$

and

$$\frac{d^2\sigma}{dQ^2 \, d\nu} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} [W_2 \cos^2(\theta/2) + 2W_1 \sin^2(\theta/2)]. \tag{9.16}$$




Yet another choice of variables is sometimes used instead of these, namely the dimensionless variables

$$x = Q^2/2M\nu \tag{9.17}$$

whose significance we shall see in the next section, and

$$y = \nu/k \tag{9.18}$$

which is the fractional energy transfer in the ‘laboratory’ frame. Note that relation (8.224) shows that $x = 1$ for elastic scattering. The Jacobian for the transformation from $Q^2$ and $\nu$ to $x$ and $y$ is (see problem 9.2(b))

$$dQ^2 \, d\nu = 2Mk^2 y \, dx \, dy. \tag{9.19}$$
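The definitions (9.9), (9.17) and (9.18), and the Jacobian (9.19), can be explored with a few lines of code. This is a minimal sketch with illustrative numbers in GeV; the function names are hypothetical, and the Jacobian is estimated by central differences rather than derived.

```python
def bjorken_x(Q2, nu, M):
    """x = Q^2 / 2 M nu, equation (9.17)."""
    return Q2 / (2.0 * M * nu)


def inelasticity_y(nu, k):
    """y = nu / k, equation (9.18)."""
    return nu / k


def hadronic_mass_squared(Q2, nu, M):
    """W^2 = M^2 + 2 M nu - Q^2, a rearrangement of (9.9)."""
    return M * M + 2.0 * M * nu - Q2


def q2_nu_from_xy(x, y, k, M):
    """Invert (9.17) and (9.18): nu = y k and Q^2 = 2 M nu x."""
    nu = y * k
    return 2.0 * M * nu * x, nu


def jacobian_numeric(x, y, k, M, h=1e-6):
    """Determinant of d(Q^2, nu)/d(x, y) from central differences."""
    Q2_xp, nu_xp = q2_nu_from_xy(x + h, y, k, M)
    Q2_xm, nu_xm = q2_nu_from_xy(x - h, y, k, M)
    Q2_yp, nu_yp = q2_nu_from_xy(x, y + h, k, M)
    Q2_ym, nu_ym = q2_nu_from_xy(x, y - h, k, M)
    dQ2_dx, dnu_dx = (Q2_xp - Q2_xm) / (2 * h), (nu_xp - nu_xm) / (2 * h)
    dQ2_dy, dnu_dy = (Q2_yp - Q2_ym) / (2 * h), (nu_yp - nu_ym) / (2 * h)
    return dQ2_dx * dnu_dy - dQ2_dy * dnu_dx


# Illustrative point: 20 GeV beam on a proton, x = 0.25, y = 0.5.
k, M = 20.0, 0.938
Q2, nu = q2_nu_from_xy(0.25, 0.5, k, M)
print(bjorken_x(Q2, nu, M), inelasticity_y(nu, k), hadronic_mass_squared(Q2, nu, M))
print(jacobian_numeric(0.25, 0.5, k, M), 2.0 * M * k ** 2 * 0.5)
```

Setting $x = 1$ in `hadronic_mass_squared` returns $M^2$, which is the elastic line of figure 8.19; any $x < 1$ gives $W^2 > M^2$.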


We emphasize that the foregoing (in particular (9.3), (9.12) and (9.16)) is all completely general, given the initial one-photon approximation. The physics is all contained in the $\nu$ and $Q^2$ dependence of the two structure functions $W_1$ and $W_2$.

A priori, one might expect $W_1$ and $W_2$ to be complicated functions of $\nu$ and $Q^2$, reflecting the complexity of the inelastic scattering process. However, in 1969 Bjorken predicted that in the ‘deep inelastic region’ (large $\nu$ and $Q^2$, but $Q^2/\nu$ finite) there should be a very simple behaviour. He predicted that the structure functions should scale, i.e. become functions not of $Q^2$ and $\nu$ independently but only of their ratio $Q^2/\nu$. It was the verification of approximate ‘Bjorken scaling’ that led to the development of the modern parton model. We therefore specialize our discussion of inelastic scattering to the deep inelastic region.




9.2 Bjorken scaling and the parton model 


From considerations based on the quark model current algebra of Gell-Mann (1962), Bjorken (1969) was led to propose the following ‘scaling hypothesis’: in the limit

$$Q^2 \to \infty, \quad \nu \to \infty \quad \text{with } x = Q^2/2M\nu \text{ fixed} \tag{9.20}$$

the structure functions scale as

$$MW_1(Q^2, \nu) \to F_1(x) \tag{9.21}$$
$$\nu W_2(Q^2, \nu) \to F_2(x). \tag{9.22}$$

We must emphasize that the physical content of Bjorken’s hypothesis is that the functions $F_1(x)$ and $F_2(x)$ are finite$^1$.

FIGURE 9.2

Bjorken scaling: the structure function $\nu W_2$ (a) plotted against $x$ for different $Q^2$ values (Attwood 1980, courtesy SLAC) and (b) plotted against $Q^2$ for the single $x$ value, $x = 0.25$ (Friedman and Kendall 1972).

Early experimental support for these predictions (figure 9.2) led initially to 
an examination of the theoretical basis of Bjorken’s arguments and to the for- 
mulation of the simple intuitive picture provided by the parton model. Closer 
scrutiny of figure 9.2(a) will encourage the (correct) suspicion that, in fact,
there is a small but significant spread in the data for any given $x$ value. In
volume 2 we shall give an introduction to the way in which QCD corrections 
to the parton model lead to predictions for logarithmic (in Q?) violations of 
simple scaling behaviour, which are in excellent agreement with experiment. 
These violations are particularly large at small values of x; for x greater than 
about 0.1, the structure functions are substantially independent of Q?, for 
a given x. The scaling predicted by Bjorken is certainly the most immedi- 
ate gross feature of the data, and an understanding of it is of fundamental 
importance. 

How can the scaling be understood? Feynman, when asked to explain 
Bjorken’s arguments, gave an intuitive explanation in terms of elastic scatter- 
ing from free point-like constituents of the nucleon, which he dubbed ‘partons’ 
(Feynman 1969). The essence of the argument lies in the kinematics of elastic 
scattering of electrons by free point-like charged partons: we will therefore be 
able to use the results of the previous chapters to derive the parton model 
results. At high Q² and ν it is intuitively reasonable (and in fact the basis for


¹It is always possible to write W(Q², ν) = f(x, Q²), say, where f(x, Q²) will tend to
some function F(x) as Q² → ∞ with x fixed. F(x) may, however, be zero, finite or infinite.
The physics lies in the hypothesis that, in this limit, a finite part remains.

FIGURE 9.3
Photon-parton interaction.


the light-cone and short-distance operator approach (Wilson 1969) to scaling) 
that the virtual photon is probing very short distances and time scales within 
the proton. In this situation, Feynman supposed that the photon interacts 
with small (point-like) constituents within the proton, which carry only a certain
fraction f of the proton’s energy and momentum (figure 9.3). Over the
short time scales involved in the transfer of a large amount of energy ν, and
at the short distances probed at large Q², the struck constituents can perhaps
be treated as effectively free and independent. (This is in sharp contrast to 
the case of elastic scattering, where the constituents are acting coherently.) 
We then have the idealized elastic scattering process shown in figure 9.4. It 
is the kinematics of the elastic scattering condition for the partons that leads 
directly to a relation between Q² and ν and hence to the observed scaling
behaviour. The original discussion of the parton model took place in the 
infinite-momentum frame of the proton. While this has the merit that it 
eliminates the need for explicit statements about parton masses and so on, it 
also obscures the simple kinematic origin of the scaling. For this reason, at the 
expense of some theoretical niceties, we prefer to perform a direct calculation 
of electron—parton scattering in close analogy with our previous examples. 

We first show that the fraction f is none other than Bjorken’s variable x. 
For a parton of type i we write 


$p_i^\mu = f p^\mu$ (9.23)
and, roughly speaking², we can imagine that the partons have mass

$m_i \approx f M$. (9.24)
Then, exactly as in (8.216) and (8.217), energy and momentum conservation


²Explicit statements about parton transverse momenta and masses, such as those made
in equations (9.23) and (9.24), are unnecessary in a rigorous treatment, where such quantities
can be shown to give rise to non-leading scaling behaviour (Sachrajda 1983).

FIGURE 9.4
Elastic electron-parton scattering.


at the parton vertex, together with the assumption that the struck parton 
remains on-shell (as indicated by the fact that in figure 9.4 the partons are 
free), imply that 

$(q + f p)^2 = m_i^2$ (9.25)


which, using (9.8), (8.222) and (9.24), gives
$f = Q^2/2M\nu \equiv x$. (9.26)
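The step from (9.25) to (9.26) can be checked numerically. The following is a minimal sketch, assuming the proton rest frame with p = (M, 0, 0, 0) and a photon with q² = −Q²; the function names and the numerical values of Q², ν and M are illustrative choices, not taken from the text.

```python
import math

def minkowski_square(v):
    """Invariant square of a 4-vector (E, px, py, pz), metric (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def on_shell_mismatch(Q2, nu, f, M=0.938):
    """Return (q + f p)^2 - (f M)^2 in GeV^2, evaluated in the proton rest frame.

    The struck parton stays on-shell, as in (9.25), when this vanishes.
    """
    p = (M, 0.0, 0.0, 0.0)                        # proton at rest
    q = (nu, 0.0, 0.0, math.sqrt(nu ** 2 + Q2))   # virtual photon with q^2 = -Q2
    final = tuple(qi + f * pi for qi, pi in zip(q, p))
    return minkowski_square(final) - (f * M) ** 2

Q2, nu, M = 4.0, 8.0, 0.938          # illustrative values (GeV^2, GeV, GeV)
x = Q2 / (2.0 * M * nu)              # Bjorken x
print("x =", x)
print("mismatch at f = x    :", on_shell_mismatch(Q2, nu, x, M))
print("mismatch at f = x / 2:", on_shell_mismatch(Q2, nu, 0.5 * x, M))
```

Algebraically the mismatch is 2fMν − Q², so it vanishes only for f = x, while any other fraction leaves the parton off-shell.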


Thus the fact that the nucleon structure functions do seem to depend 
(to a good approximation) only on the variable x is interpreted physically as 
showing that the scattering is dominated by the ‘quasi-free’ electron—parton 
process shown in figure 9.4. In section 11.5.3 we shall see how the ‘asymptotic 
freedom’ property of QCD suggests a dynamical understanding of this picture, 
as will be discussed further in chapter 15 of volume 2. 

What sort of values for x do we expect? Consider an analogous situation 
— electron scattering from deuterium. Here the target (the deuteron) is un- 
doubtedly composite, and its ‘partons’ are, to a first approximation, just the 
two nucleons. Since m_N ≈ ½m_D, we expect to see the value x ≈ ½ (cf (9.24))
favoured; x = 1 here would correspond to elastic scattering from the deuteron.
A peak at x ≈ ½ is indeed observed (figure 9.5) in quasi-elastic e⁻d scattering
(the broadening of the peak is due to the fact that the constituent nucleons 
have some motion within the deuteron). By ‘quasi-elastic’ here we mean that 
the incident electron scatters off ‘quasi-free’ nucleons, an approximation we 
expect to be good for incident energies significantly greater than the binding 
energy of the n and p in the deuteron (~2 MeV). What about the nucleon 
itself, then? A simple three-quark model would, on this analogy, lead us to 
expect a peak at x ≈ ⅓, but the data already shown (figure 9.2(a)) do not
look much like that. Perhaps there is something else present too — which we 
shall uncover as our story proceeds. 
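The expected peak positions follow from the same kinematics: elastic scattering from a free constituent of mass m at rest has ν = Q²/2m, while x is defined with the target mass. A minimal sketch, assuming constituents at rest (so no Fermi smearing) and rounded mass values; the "naive quark" mass M/3 is only the analogy made above, not a physical quark mass.

```python
def bjorken_x(Q2, nu, M_target):
    """Bjorken variable x = Q^2 / (2 M nu), defined with the TARGET mass M."""
    return Q2 / (2.0 * M_target * nu)

def elastic_energy_transfer(Q2, m):
    """Energy transfer nu for elastic scattering from a free particle of mass m at rest."""
    return Q2 / (2.0 * m)

M_DEUTERON = 1.8756   # GeV, rounded
M_NUCLEON = 0.9389    # GeV, average of proton and neutron, rounded
M_PROTON = 0.9383     # GeV, rounded

Q2 = 2.0  # GeV^2, illustrative; the result does not depend on it
x_deuteron = bjorken_x(Q2, elastic_energy_transfer(Q2, M_NUCLEON), M_DEUTERON)
x_naive_quark = bjorken_x(Q2, elastic_energy_transfer(Q2, M_PROTON / 3.0), M_PROTON)
print("quasi-elastic peak in e-d scattering:", x_deuteron)
print("naive three-quark expectation       :", x_naive_quark)
```

In both cases x reduces to the mass ratio m/M, independent of Q².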

Certainly it seems sensible to suppose that a nucleon contains at least some 
quarks (and also antiquarks) of the type introduced in the simple composite 
models of the nucleon (section 1.2.2). If quarks are supposed to have spin-½,
then the scattering of an electron from a quark or antiquark — generically a
charged parton — of type i, charge e_i (in units of e) is just given by the eμ
scattering cross section (8.228), with obvious modifications:

$\frac{d^2\sigma^i}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left(e_i^2\cos^2(\theta/2) + e_i^2\,\frac{Q^2}{4m_i^2}\,2\sin^2(\theta/2)\right)\delta(\nu - Q^2/2m_i)$. (9.27)

FIGURE 9.5
Structure function for quasi-elastic ed scattering, plotted against x (Attwood
1980, courtesy SLAC).


This is to be compared with the general inclusive inelastic cross section formula
written in terms of W₁ and W₂:

$\frac{d^2\sigma}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)\right]$. (9.28)


Thus the contribution to W₁ and W₂ from one parton of type i is immediately
seen to be

$W_1^i = e_i^2\,\frac{Q^2}{4M^2x^2}\,\delta(\nu - Q^2/2Mx)$ (9.29)
$W_2^i = e_i^2\,\delta(\nu - Q^2/2Mx)$ (9.30)

where we have set m_i = xM. At large ν and Q² it is assumed that the
contributions from different partons add incoherently in cross section. Thus,
to obtain the total contribution from all quark partons, we must sum over the
contributions from all types of partons, i, and integrate over all values of x,
the momentum fraction carried by the parton. The integral over x must be
weighted by the probability f_i(x) for the parton of type i to have a fraction x of
momentum. These probability distributions — or parton distribution functions
(PDFs) — are not predicted by the model and are, in this parton picture,
fundamental parameters of the proton. The structure function W₂ becomes


$W_2(\nu, Q^2) = \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \delta(\nu - Q^2/2Mx)$. (9.31)
Using the result for the Dirac δ-function (see appendix E, equation (E.34))
$\delta(g(x)) = \frac{\delta(x - x_0)}{|dg/dx|_{x=x_0}}$ (9.32)
where x₀ is defined by g(x₀) = 0, we can rewrite
$\delta(\nu - Q^2/2Mx) = (x/\nu)\,\delta(x - Q^2/2M\nu)$ (9.33)
under the x integral. Hence we obtain
$\nu W_2(\nu, Q^2) = \sum_i e_i^2\, x f_i(x) \equiv F_2(x)$ (9.34)
which is the desired scaling behaviour. Similar manipulations lead to
$M W_1(\nu, Q^2) = F_1(x)$ (9.35)
where
$2x F_1(x) = F_2(x)$. (9.36)
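The δ-function manipulation leading to (9.34) can be illustrated numerically. The sketch below evaluates νW₂ from (9.31) for a single parton type, replacing the δ-function by a narrow normalised Gaussian, and compares it with e_i² x f_i(x) at two different (Q², ν) pairs sharing the same x. The parton density `toy_pdf`, the charge, the Gaussian width and the grid size are all assumptions made purely for illustration.

```python
import math

M = 0.938  # proton mass in GeV

def toy_pdf(x):
    """Hypothetical parton density, chosen only for illustration."""
    return 3.0 * (1.0 - x) ** 2

def nu_W2(Q2, nu, pdf=toy_pdf, charge_sq=4.0 / 9.0, rel_width=1e-3, steps=100_000):
    """nu * W2 from (9.31) for one parton type, with the delta function
    replaced by a normalised Gaussian of width rel_width * nu."""
    width = rel_width * nu
    norm = 1.0 / (width * math.sqrt(2.0 * math.pi))
    dx = 1.0 / steps
    total = 0.0
    for n in range(steps):
        x = (n + 0.5) * dx                    # midpoint rule on (0, 1)
        g = nu - Q2 / (2.0 * M * x)           # argument of the delta function
        total += pdf(x) * norm * math.exp(-0.5 * (g / width) ** 2) * dx
    return nu * charge_sq * total

def F2_scaling(x, pdf=toy_pdf, charge_sq=4.0 / 9.0):
    """Right-hand side of (9.34) for a single parton type."""
    return charge_sq * x * pdf(x)

x = 0.25
for nu in (10.0, 40.0):
    Q2 = 2.0 * M * nu * x                     # keeps x = Q^2 / 2 M nu fixed
    print(nu, Q2, nu_W2(Q2, nu), F2_scaling(x))
```

Both (Q², ν) pairs should give the same value of νW₂ to within the accuracy of the smeared δ-function, which is the scaling statement in miniature.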


This relation between F₁ and F₂ is called the Callan-Gross relation (see
Callan and Gross 1969): it is a direct consequence of our assumption of spin-½
partons. The physical origin of this relation is best discussed in terms of
virtual photon total cross sections for transverse (λ = ±1) virtual photons
and for a longitudinal/scalar (λ = 0) virtual photon contribution. The
longitudinal/scalar photon is present because q² ≠ 0 for a virtual photon (see
comment (4) in section 8.3.1). However, in the discussion of polarization
vectors a slight difference occurs for space-like q². In a frame in which

$q^\mu = (q^0, 0, 0, q^3)$ (9.37)

the transverse polarization vectors are as before

$\epsilon^\mu(\lambda = \pm 1) = \mp 2^{-1/2}(0, 1, \pm i, 0)$ (9.38)

with normalization (see equation (7.87))
$\epsilon^* \cdot \epsilon = -1$. (9.39)
To construct the longitudinal/scalar polarization vector, we must satisfy

$q \cdot \epsilon = 0$ (9.40)
and so are led to the result

$\epsilon^\mu(\lambda = 0) = (1/\sqrt{Q^2})(q^3, 0, 0, q^0)$ (9.41)

with
$\epsilon^2(\lambda = 0) = +1$. (9.42)


The precise definition of a virtual photon cross section is obviously just a 
convention. It is usually taken to be 


$\sigma_\lambda(\gamma p \to X) = (4\pi^2\alpha/K)\,\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu}$ (9.43)

by analogy with the total cross section for real photons of polarization λ
incident on an unpolarized proton target. Note the presence of the factor W^{μν}
defined in (9.3). The factor K is the flux factor; for real photons, producing
a final state of mass W, this is just the photon energy in the rest frame of the
target nucleon:

$K = (W^2 - M^2)/2M$. (9.44)


In the so-called ‘Hand convention’, this same factor is used for virtual photons
which produce a final state of mass W. With these definitions we find (see
problem 9.3) that the transverse (λ = ±1) photon cross section
$\sigma_T = (4\pi^2\alpha/K)\,\tfrac{1}{2}\sum_{\lambda=\pm 1}\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu}$ (9.45)
is given by
$\sigma_T = (4\pi^2\alpha/K)\,W_1$ (9.46)

and the longitudinal/scalar cross section

$\sigma_S = (4\pi^2\alpha/K)\,\epsilon^*_\mu(\lambda = 0)\,\epsilon_\nu(\lambda = 0)\,W^{\mu\nu}$ (9.47)
by

$\sigma_S = (4\pi^2\alpha/K)\,[(1 + \nu^2/Q^2)W_2 - W_1]$. (9.48)


In fact these expressions give an intuitive explanation of the positivity properties
of W₁ and W₂, namely

$W_1 \ge 0$ (9.49)
$(1 + \nu^2/Q^2)W_2 - W_1 \ge 0$. (9.50)


The combination in the λ = 0 cross section is sometimes denoted by W_S:
$W_S = (1 + \nu^2/Q^2)W_2 - W_1$. (9.51)
The scaling limit of these expressions can be taken using

$\nu W_2 \to F_2$ (9.52)
$M W_1 \to F_1$ (9.53)

FIGURE 9.6
Photon-parton interaction in the Breit frame.


and x = Q²/2Mν finite, as Q² and ν grow large. We find

$\sigma_T \to (4\pi^2\alpha/MK)\,F_1(x)$ (9.54)
and
$\sigma_S \to (4\pi^2\alpha/MK)(1/2x)(F_2 - 2xF_1)$ (9.55)

where we have neglected a term of order MF₂/ν in the last expression. Thus
the Callan-Gross relation corresponds to the result

$\sigma_S/\sigma_T \to 0$ (9.56)


in terms of photon cross sections. 
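The approach of σ_S/σ_T to zero can be made explicit. The sketch below assumes, purely for illustration, that the scaling forms W₁ = F₁/M and W₂ = F₂/ν and the Callan-Gross relation hold exactly at finite ν; under that assumption (9.46) and (9.48) give σ_S/σ_T = Q²/ν² = 2Mx/ν, which is just the neglected term of order MF₂/ν. The value of F₂ is an arbitrary placeholder.

```python
def sigma_S_over_sigma_T(W1, W2, Q2, nu):
    """Ratio from (9.46) and (9.48); the common factor 4 pi^2 alpha / K cancels."""
    return ((1.0 + nu ** 2 / Q2) * W2 - W1) / W1

def scaling_W1_W2(F2, x, nu, M=0.938):
    """W1 and W2 from exact scaling plus Callan-Gross (assumed to hold at finite nu)."""
    F1 = F2 / (2.0 * x)
    return F1 / M, F2 / nu

M, x, F2 = 0.938, 0.25, 0.3      # F2 value is illustrative only
for nu in (10.0, 100.0, 1000.0):
    Q2 = 2.0 * M * nu * x        # keeps x fixed as nu grows
    W1, W2 = scaling_W1_W2(F2, x, nu, M)
    print(nu, sigma_S_over_sigma_T(W1, W2, Q2, nu), Q2 / nu ** 2)
```

The two printed columns agree, and both fall like 1/ν at fixed x.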
A parton calculation using point-like spin-0 partons shows the opposite 
result, namely 
$\sigma_T/\sigma_S \to 0$. (9.57)


Both these results may be understood by considering the helicities of partons 
and photons in the so-called parton Breit or ‘brick-wall’ frame. The partic- 
ular frame is the one in which the photon and parton are collinear and the 
3-momentum of the parton is exactly reversed by the collision (see figure 9.6). 
In this frame, the photon transfers no energy, only 3-momentum. The van- 
ishing of transverse photon cross sections for scalar partons is now obvious. 
The transverse photons bring in ±1 units of the z-component of angular momentum:
spin-0 partons cannot absorb this. Thus only the scalar λ = 0 cross
section is non-zero. For spin-½ partons the argument is slightly more complicated
in that it depends on the helicity properties of the γ_μ coupling of the
parton to the photon. As is shown in problem 9.4, for massless spin-½ particles
the γ_μ coupling conserves helicity — i.e. the projection of spin along the direction
of motion of the particle. Thus in the Breit frame, and neglecting parton
masses, conservation of helicity necessitates a change in the z-component of
the parton’s angular momentum by ±1 unit, thereby requiring the absorption
of a transverse photon (figure 9.7). The Lorentz transformation from the
parton Breit frame to the ‘laboratory’ frame does not affect the ratio of transverse
to longitudinal photons, if we neglect the parton transverse momenta.

FIGURE 9.7
Angular momentum balance for absorption of photon by helicity-conserving
spin-½ parton.

FIGURE 9.8

The ratio 2xF₁/F₂: ○, 1.5 < Q² < 4 GeV²; ●, 0.5 < Q² < 11 GeV²; ×, 12 <
Q² < 16 GeV². (Figure from D H Perkins Introduction to High Energy Physics
3rd edn, copyright 1987; reprinted by permission of Pearson Education, Inc.,
Upper Saddle River, NJ.)


These arguments therefore make clear the origin of the Callan-Gross relation.
Experimentally, the Callan-Gross relation is reasonably well satisfied
in that R = σ_S/σ_T is small for most, if not all, of the deep inelastic regime
(figure 9.8). This leads us to suppose that the electrically charged partons
coupling to photons have spin-½.


9.3 Partons as quarks and gluons 


We now proceed a stage further, with the idea that the charged partons are 
quarks (and antiquarks). If we assume that the photon only couples to these 
objects, we can make more specific scaling predictions. The quantum numbers 
of the quarks have been given in Table 1.2. For a proton we have the result 
(cf (9.34)) 


$F_2^{ep}(x) = x\{\tfrac{4}{9}[u(x) + \bar{u}(x)] + \tfrac{1}{9}[d(x) + \bar{d}(x) + s(x) + \bar{s}(x)] + \cdots\}$ (9.58)


where u(x) is the probability distribution for u quarks in the proton, ū(x) for
u antiquarks and so on in an obvious notation, and the dots indicate further 
possible flavours. So far we do not seem to have gained much, replacing 
one unknown function by six or more unknown functions. The full power of 
the quark parton model lies in the fact that the same distribution functions 
appear, in different combinations, for neutron targets, and in the analogous 
scaling functions for deep inelastic scattering with neutrino and antineutrino 
beams (see volume 2). For electron scattering from neutron targets we can use 
I-spin invariance (see for example Close 1979, or Leader and Predazzi 1996) 
to relate the distribution of u and d quarks in a neutron to the distributions 
in a proton, and similarly for the antiquarks. The results are 


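$u^p(x) = d^n(x) \equiv u(x)$, $\bar{u}^p(x) = \bar{d}^n(x) \equiv \bar{u}(x)$ (9.59)
$d^p(x) = u^n(x) \equiv d(x)$, $\bar{d}^p(x) = \bar{u}^n(x) \equiv \bar{d}(x)$ (9.60)
$s^p(x) = s^n(x) \equiv s(x)$, $\bar{s}^p(x) = \bar{s}^n(x) \equiv \bar{s}(x)$. (9.61)


Hence the scaling function for en scattering may be written


$F_2^{en}(x) = x\{\tfrac{4}{9}[d(x) + \bar{d}(x)] + \tfrac{1}{9}[u(x) + \bar{u}(x) + s(x) + \bar{s}(x)] + \cdots\}$. (9.62)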


The quark distributions inside the proton and neutron must satisfy some 
constraints. Since both proton and neutron have strangeness zero, we have a 
sum rule (treating only u, d and s flavours from now on) 


$\int_0^1 dx\,[s(x) - \bar{s}(x)] = 0$. (9.63)


Similarly, from the proton and neutron charges we obtain two other sum rules:


$\int_0^1 dx\,\{\tfrac{2}{3}[u(x) - \bar{u}(x)] - \tfrac{1}{3}[d(x) - \bar{d}(x)]\} = 1$ (9.64)
$\int_0^1 dx\,\{\tfrac{2}{3}[d(x) - \bar{d}(x)] - \tfrac{1}{3}[u(x) - \bar{u}(x)]\} = 0$. (9.65)
These are equivalent to the sum rules


$2 = \int_0^1 dx\,[u(x) - \bar{u}(x)]$ (9.66)
$1 = \int_0^1 dx\,[d(x) - \bar{d}(x)]$ (9.67)


which are, of course, just the excess of u and d quarks over antiquarks inside 
the proton. Testing these sum rules requires neutrino data to separate the 
various structure functions, as we shall explain in volume 2, chapter 20. 

One can gain some further insight if one is prepared to make a model. For 
example, one can introduce the idea of ‘valence’ quarks (those of the elementary
constituent quark model) and ‘sea’ quarks (qq̄ pairs created virtually).
Then, in a proton, the u and d quark distributions would be parametrized by 
the sum of valence and sea contributions 


$u = u_V + q_S$ (9.68)
$d = d_V + q_S$ (9.69)


while the antiquark and strange quark distributions are taken to be pure sea
$\bar{u} = \bar{d} = s = \bar{s} = q_S$ (9.70)


where we have assumed that the ‘sea’ is flavour-independent. Such a model 
replaces the six unknown functions now in play by three, and is consequently 
more predictive. The strangeness sum rule (9.63) is now satisfied automati- 
cally, while (9.66) and (9.67) are satisfied by the valence distributions alone: 


$\int_0^1 dx\, u_V(x) = 2$ (9.71)
$\int_0^1 dx\, d_V(x) = 1$. (9.72)


One more important sum rule emerges from the picture of x f_i(x) as the
fractional momentum carried by quark i. This is the momentum sum rule


$\int_0^1 dx\, x\,[u(x) + \bar{u}(x) + d(x) + \bar{d}(x) + s(x) + \bar{s}(x)] = 1 - \epsilon$ (9.73)


where ε is interpreted as the fraction of the proton momentum that is not
carried by quarks and antiquarks. The integral in (9.73) is directly related
to ν and ν̄ cross sections, and its evaluation implies ε ≈ ½ (the CHARM
(1981) result was 1 − ε = 0.44 ± 0.02). This suggests that about half the
total momentum is carried by uncharged objects. These remaining partons 
are identified with the gluons of QCD. They have their own PDF, g(x). 
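The bookkeeping behind the sum rules can be sketched with a toy model. Below, the valence densities are taken to be of the hypothetical form N x^(a−1)(1−x)^(b−1), normalised so that (9.71) and (9.72) hold; the exponents are illustrative assumptions and are not fitted to data. The charge sum rules (9.64) and (9.65) then follow from the valence numbers alone, and the momentum integral shows how much of the proton momentum such valence quarks would carry.

```python
import math
from fractions import Fraction

def beta(a, b):
    """Euler beta function B(a, b) = integral of x^(a-1) (1-x)^(b-1) over (0, 1)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def valence_moments(n_quarks, a, b):
    """Number and momentum integrals of the toy density N x^(a-1) (1-x)^(b-1),
    with N fixed so that the number integral equals n_quarks."""
    norm = n_quarks / beta(a, b)
    number = norm * beta(a, b)            # integral of q_V(x) dx
    momentum = norm * beta(a + 1.0, b)    # integral of x q_V(x) dx
    return number, momentum

def nucleon_charges(n_u, n_d):
    """Left-hand sides of (9.64) and (9.65) given the net numbers of u and d quarks."""
    proton = Fraction(2, 3) * n_u - Fraction(1, 3) * n_d
    neutron = Fraction(2, 3) * n_d - Fraction(1, 3) * n_u
    return proton, neutron

n_u, p_u = valence_moments(2, 0.5, 4.0)   # toy u_V: x^(-1/2) (1-x)^3
n_d, p_d = valence_moments(1, 0.5, 5.0)   # toy d_V: x^(-1/2) (1-x)^4
print("valence numbers        :", n_u, n_d)
print("proton, neutron charges:", nucleon_charges(2, 1))
print("valence momentum share :", p_u + p_d)
```

In this toy model the valence quarks carry only about a third of the momentum; the remainder would have to be shared between the sea and the neutral partons, which is the qualitative point of (9.73).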

An enormous effort, both experimental and theoretical, has gone into de- 
termining the parton distribution functions. The subject is regularly reviewed 



FIGURE 9.9 

Distributions of x times the unpolarized parton distribution functions f(x)
(where f = u_V, d_V, ū, d̄, s, c, b, g) and their associated uncertainties using the
MSTW2008 parametrization (Martin et al. 2009) at a scale μ² = 10 GeV²
and μ² = 10,000 GeV². [Figure reproduced courtesy Michael Barnett, for the
Particle Data Group, from the review of Structure Functions by B F Foster, 
A D Martin and M G Vincter, section 16 in the Review of Particle Physics, 
K Nakamura et al. (Particle Data Group) Journal of Physics G 37 (2010) 
075021, IOP Publishing Limited.] (See color plate I.) 


by the Particle Data Group (currently Nakamura et al. 2010). Figure 9.9 
shows the result of one analysis. In this much more sophisticated approach, 
which includes higher order QCD corrections, it is necessary to specify a particular
value of Q² (here denoted by Q² = μ²) at which the distributions are
defined, as explained in chapter 15 of volume 2. The distributions at this
value are quantities to be determined from experiment. The distributions at
other values of Q² are then predicted by perturbative QCD.


The main features of the PDFs shown in figure 9.9 are: the valence quark 
distributions are peaked at around x = 0.2, and go to zero for x → 0 and
x → 1; the sea quarks, on the other hand, have a high probability of carrying
very low momentum fractions, as do the gluons — in fact, the gluons dominate
for x below about 0.1. This is then the picture of ‘what nucleons are made
of’, as revealed by some 40 years of research.

FIGURE 9.10
Drell-Yan process.


9.4 The Drell-Yan process


Much of the importance of the parton model lies outside its original domain of 
deep inelastic scattering. In deep inelastic scattering it is possible to provide 
a more formal basis for the parton model in terms of light-cone and short- 
distance operator expansions (see chapter 18 of Peskin and Schroeder 1995). 
The advantage of the parton formulation lies in the fact that it suggests other 
processes for which a parton description may be relevant but for which formal 
operator arguments are not possible. One such example is the Drell-Yan 
process (Drell and Yan 1970) 


$p + p \to \mu^+\mu^- + X$ (9.74)


in which a μ⁺μ⁻ pair is produced in proton-proton collisions along with unobserved
hadrons X, as shown in figure 9.10. The assumption of the parton
model is that in the limit


$s \to \infty$ with $\tau = q^2/s$ finite (9.75)


the dominant process is that shown in figure 9.11: a quark and antiquark from
different hadrons are assumed to annihilate to a virtual photon which then
decays to a μ⁺μ⁻ pair (compare figures 9.3 and 9.4), the remaining quarks
and antiquarks subsequently emerging as hadrons. 

Let us work in the CM system and neglect all masses. In this case we have 


$p_1^\mu = (P, 0, 0, P)$, $p_2^\mu = (P, 0, 0, -P)$ (9.76)
and
$s = 4P^2$. (9.77)
Neglecting quark masses and transverse momenta, we have quark momenta
$p_{q_1}^\mu = x_1(P, 0, 0, P)$ (9.78)
$p_{q_2}^\mu = x_2(P, 0, 0, -P)$ (9.79)

FIGURE 9.11
Parton model amplitude for the Drell-Yan process.

and the photon momentum
$q = p_{q_1} + p_{q_2}$ (9.80)
has non-zero components
$q^0 = (x_1 + x_2)P$ (9.81)
$q^3 = (x_1 - x_2)P$. (9.82)
Thus we find
$q^2 = 4x_1x_2P^2$ (9.83)
and hence

$\tau = q^2/s = x_1x_2$. (9.84)
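The kinematic relations (9.76) to (9.84) are easy to verify by building the four-vectors explicitly. This is a minimal sketch assuming massless, collinear partons in the CM frame, as in the text; the function name and the sample momentum fractions are illustrative.

```python
def drell_yan_kinematics(x1, x2, P):
    """Return (q2, tau) for a quark with fraction x1 of beam 1 and an
    antiquark with fraction x2 of beam 2, each beam having CM momentum P."""
    pq1 = (x1 * P, 0.0, 0.0, x1 * P)      # equation (9.78)
    pq2 = (x2 * P, 0.0, 0.0, -x2 * P)     # equation (9.79)
    q = tuple(a + b for a, b in zip(pq1, pq2))
    q2 = q[0] ** 2 - q[1] ** 2 - q[2] ** 2 - q[3] ** 2
    s = 4.0 * P ** 2                      # equation (9.77)
    return q2, q2 / s

q2, tau = drell_yan_kinematics(0.3, 0.1, 100.0)
print("q^2 =", q2, "GeV^2;  tau =", tau, ";  x1 * x2 =", 0.3 * 0.1)
```

The invariant mass squared of the lepton pair therefore fixes only the product x₁x₂, not the two fractions separately.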


The cross section for the basic process 
$q\bar{q} \to \mu^+\mu^-$ (9.85)
is calculated using the result of problem 8.18. Since the QED process


$e^+e^- \to \mu^+\mu^-$ (9.86)


has the cross section (neglecting all masses)


$\sigma(e^+e^- \to \mu^+\mu^-) = 4\pi\alpha^2/3q^2$ (9.87)


we expect the result for a quark of type a with charge e_a (in units of e) to be
$\sigma(q_a\bar{q}_a \to \mu^+\mu^-) = (4\pi\alpha^2/3q^2)\,e_a^2$. (9.88)


To obtain the parton model prediction for proton-proton collisions, one merely
multiplies this cross section by the probabilities for finding a quark of type a
with momentum fraction x₁, and an antiquark of the same type with fraction
x₂, namely


$q_a(x_1)\,dx_1\,\bar{q}_a(x_2)\,dx_2$. (9.89)
There is, of course, another contribution for which the antiquark has fraction
x₁ and the quark x₂:


$\bar{q}_a(x_1)\,dx_1\,q_a(x_2)\,dx_2$. (9.90)


Thus the Drell-Yan prediction is 


$d^2\sigma(pp \to \mu^+\mu^- + X) = \frac{4\pi\alpha^2}{9q^2}\sum_a e_a^2\,[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)]\,dx_1\,dx_2$ (9.91)


where we have included a factor ⅓ to account for the colour of the quarks:
in order to make a colour singlet photon, one needs to match the colours of
quark and antiquark. Equation (9.91) is the master formula. Its importance
lies in the fact that the same quark distribution functions are measured in
deep inelastic lepton scattering so one can make absolute predictions.³ For
example, if the photon in figure 9.11 is replaced by a W(Z), one can predict 
W(Z) production cross sections, as we shall see in volume 2. 

We would expect some ‘scaling’ property to hold for this cross section, fol- 
lowing from the point-like constituent cross section (9.88). One way to exhibit 
this is to use the variables q² and x_F = x₁ − x₂ as discussed in problem 9.6.
There it is shown that the dimensionless quantity


$q^4\,\frac{d^2\sigma}{dq^2\,dx_F}$ (9.92)
should be a function of x_F and the ratio τ = q²/s. The data bear out this
prediction well — see figure 9.12.

Furthermore, the assumption that the lepton pair is produced via quark— 
antiquark annihilation to a virtual photon can be checked by observing the 
angular distribution of either lepton in the dilepton rest frame, relative to the 
incident proton beam direction. This distribution is expected to be the same 
as in e⁺e⁻ → μ⁺μ⁻, namely (cf (8.194))


$d\sigma/d\Omega \propto (1 + \cos^2\theta)$ (9.93)


as is indeed observed (figure 9.13). Note that figure 9.13 provides evidence
that the quarks have spin-½: if they are assumed to have spin-0, the angular
distribution would be (see problem 9.7) proportional to (1 − cos²θ), and this
is clearly ruled out.


³QCD corrections make the connection more complicated, but still perturbatively
computable.

FIGURE 9.12

The dimensionless cross section M³d²σ/dM dx_F (M = √q²) at x_F = 0 for
pN scattering, plotted against √τ = M/√s (Scott 1985): ●, √s = 62 GeV;
, 44; □, 27.4; ○, 23.8.

FIGURE 9.13

Angular distribution of muons, measured in the μ⁺μ⁻ rest frame, relative
to the incident beam direction, in the Drell-Yan process. (Figure from D
H Perkins Introduction to High Energy Physics 3rd edn, copyright 1987;
reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ.)

FIGURE 9.14
e⁺e⁻ annihilation to hadrons in one-photon approximation.


9.5 e⁺e⁻ annihilation into hadrons


The last electromagnetic process we wish to consider is electron—positron an- 
nihilation into hadrons (figure 9.14): 


$e^+e^- \to X$. (9.94)


As usual, the dominance of the one-photon intermediate state is assumed.
Figure 9.14 is clearly a generalization of figure 8.9, the latter describing the
particular case in which the final hadronic state is π⁺π⁻. As a preliminary
to discussing (9.94), let us therefore revisit e⁺e⁻ → π⁺π⁻ first.

The O(e²) amplitude is given in equation (8.159). We shall simplify the
calculation by neglecting both the electron and the pion masses. The spinor
part of the amplitude is then $-2\bar{v}(k_1)\not{p}_1 u(k)$, and the ‘L · T’ product is
16(k · p₁)(k₁ · p₁). Borrowing the general CM cross section formula (6.129) from
chapter 6 as in (8.121), and including the pion form factor, we obtain for the
unpolarized CM differential cross section


do F?(q°)a? 2 


and the total unpolarized cross section is 
2ra? 


= 2 f ye 
fal er 


(9.96) 
The cross section σ̄ contains a 1/q² factor, just like that for e⁺e⁻ → μ⁺μ⁻ as
in (9.87), but this ‘pointlike’ behaviour is modified by the square of the form
factor, evaluated at time-like q². When the measured σ̄ is plotted against q²
for q² < 1 (GeV)², a pronounced resonance is seen at q² ≈ m_ρ², superimposed
on the smooth 1/q² background, where m_ρ is the mass of the rho resonance
(J^P = 1⁻ qq̄ state). The interpretation of this is shown in figure 9.15. F(q²)
should therefore be parametrized as a resonance, as in (6.107) — or a more
sophisticated version to take account of the fact that the π’s are emitted in an
l = 1 state. Just as F²(q²) modified the point-like cross section in the space-like
region for e⁻π⁺ → e⁻π⁺, so here it modifies the point-like (∼ 1/q²)
behaviour in the time-like region.

FIGURE 9.15
ρ-dominance of the pion electromagnetic form factor in the time-like (q² > 0)
region.

Returning now to the process (9.94), the cross section for it is shown as a
function of CM energy (q²)^{1/2} in figure 9.16. The general point-like fall-off as
1/q² is seen, with peaks due to a succession of boson resonances superimposed
(ρ, J/ψ, Υ, Z⁰, ...). The 1/q² fall-off is suggestive of a (point-like) parton
picture and indeed the process (9.94) is similar to the Drell-Yan one:


$pp \to \mu^+\mu^- + X$. (9.97)


It is natural to imagine that at large q² the basic subprocess is quark-antiquark
pair creation (figure 9.17). The total cross section for qq̄ pair production is
then (cf (9.88))


$\sigma(e^+e^- \to q_a\bar{q}_a) = (4\pi\alpha^2/3q^2)\,e_a^2$. (9.98)


In the vicinity of mesonic resonances such as the ρ, we can infer that the
dominant component in the final state is that in which the qq̄ pair is strongly
bound into a mesonic state, which then decays into hadrons. Away from resonances,
and increasingly at larger values of q², the produced q and q̄ seek to
separate from the interaction region. As they draw apart, however, the interaction
between them increases (recall section 1.3.6), producing more qq̄ pairs,
together with radiated gluons. In this process, the coloured quarks and gluons
eventually must form colourless hadrons, since we know that no coloured
particles have been observed (‘confinement of colour’). If one assumes that
the presumed colour confinement mechanism does not affect the prediction
(9.98), then we arrive at the result


$\sigma(e^+e^- \to \text{hadrons}) = (4\pi\alpha^2/3q^2)\sum_a e_a^2$ (9.99)


at large q², where ‘a’ includes all flavours produced at that energy.

FIGURE 9.16

The cross section σ for the annihilation process e⁺e⁻ → hadrons, and the
ratio R (see equation (9.100)), as a function of cm energy. [Figure reproduced
courtesy Michael Barnett, for the Particle Data Group, from the Review of
Particle Physics, K Nakamura et al. (Particle Data Group) Journal of Physics
G 37 (2010) 075021 IOP Publishing Limited.] (See color plate II.)


FIGURE 9.17
Parton model subprocess in e⁺e⁻ → hadrons.

FIGURE 9.18
Two-jet event in e⁺e⁻ annihilation from the TASSO detector at the e⁺e⁻
storage ring PETRA.


This model is best tested by taking out the dominant 1/q² behaviour and
plotting the ratio


$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \sum_a e_a^2$. (9.100)


For the light quarks u, d and s occurring in three colours, we therefore predict


$R = 3[(\tfrac{2}{3})^2 + (-\tfrac{1}{3})^2 + (-\tfrac{1}{3})^2] = 2$. (9.101)
Above the c threshold but below the b threshold we expect R = 10/3, and
above the b threshold R = 11/3. These expectations are in reasonable accord

with experiment, especially at energies well beyond the resonance region and 
the b threshold, as figure 9.16 shows. In this figure the dotted curve is the 
prediction of the quark-parton model, equation (9.99). The solid curve in- 
cludes perturbative QCD corrections, which we will return to in chapter 15 of 
volume 2. 
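The parton-model values of R quoted above are simple charge sums and can be tabulated with exact fractions. The sketch below also evaluates the point-like cross section (9.87) in nanobarns; the numerical constants (the low-energy value of α and the conversion factor (ħc)²) are standard rounded values supplied here as assumptions, and the running of α is ignored.

```python
import math
from fractions import Fraction

# Quark charges in units of e
QUARK_CHARGE = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3),
}

def r_ratio(flavours, n_colours=3):
    """Quark-parton model R: colour factor times the sum of squared charges."""
    return n_colours * sum(QUARK_CHARGE[f] ** 2 for f in flavours)

ALPHA = 1.0 / 137.036          # fine structure constant at low energy (rounded)
HBARC2_NB = 0.389379e6         # (hbar c)^2 in nb GeV^2 (rounded)

def sigma_mumu_nb(q2_gev2):
    """Point-like cross section 4 pi alpha^2 / 3 q^2 of (9.87), in nb."""
    return 4.0 * math.pi * ALPHA ** 2 / (3.0 * q2_gev2) * HBARC2_NB

for flavours in ("uds", "udsc", "udscb"):
    print(flavours, r_ratio(flavours))
print("sigma(mu+ mu-) at q^2 = 1 GeV^2:", sigma_mumu_nb(1.0), "nb")
```

Setting `n_colours=1` shows what the prediction would be without colour, which is one way to see why the data in figure 9.16 are regarded as evidence for three colours.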

The success of this prediction leads one to consider more detailed consequences
of the picture. For example, the angular distribution of massless
spin-½ quarks is expected to be (cf (8.194) again)


$d\sigma/d\Omega = (\alpha^2/4q^2)\,e_a^2\,(1 + \cos^2\theta)$ (9.102)


just as for the μ⁺μ⁻ process. However, in this case there is an important
difference: the quarks are not observed! Nevertheless a remarkable ‘memory’
of (9.102) is retained by the observed final-state hadrons. Experimentally one

FIGURE 9.19

Angular distribution of jets in two-jet events, measured in the two-jet rest
frame, relative to the incident beam direction, in the process e⁺e⁻ → two jets
(Althoff et al. 1984). The full curve is the (1 + cos²θ) distribution. Since it
is not possible to say which jet corresponded to the quark and which to the 
antiquark, only half the angular distribution can be plotted. The asymmetry 
visible in figure 8.20(b) is therefore not apparent. 


observes events in which hadrons emerge from the interaction region in two 
relatively well-collimated cones or ‘jets’ — see figure 9.18. The distribution 
of events as a function of the (inferred) angle of the jet axis is shown in 
figure 9.19 and is in good agreement with (9.102). The interpretation is that 
the primary process is e⁺e⁻ → qq̄, the quark and the antiquark then turning
into hadrons as they separate and experience the very strong colour forces, 
but without losing the memory of the original quark angular distribution. We 
shall discuss jets more fully in chapter 14 of volume 2, in the context of QCD. 




Problems 


9.1 The various normalization factors in equations (9.3) and (9.11) may be
checked in the following way. The cross section for inclusive electron-proton
scattering may be written (equation (9.11)):


$d\sigma = \left(\frac{4\pi\alpha}{Q^2}\right)^2 \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\, 4\pi M\, L_{\mu\nu}W^{\mu\nu}\, \frac{d^3k'}{2\omega'(2\pi)^3}$ (9.103)


in the usual one-photon exchange approximation, and the tensor W^{μν} is related
to hadronic matrix elements of the electromagnetic current operator by
equation (9.3):


$e^2 W^{\mu\nu}(q, p) = \frac{1}{4\pi M}\,\frac{1}{2}\sum_s \sum_X \langle p; p, s|\hat{j}^\mu_{em}(0)|X; p'\rangle \langle X; p'|\hat{j}^\nu_{em}(0)|p; p, s\rangle\,(2\pi)^4\delta^4(p + q - p')$


where the sum X is over all possible hadronic final states. If we consider the
special case of elastic scattering, the sum over X is only over the final proton’s
degrees of freedom:


$e^2 W^{\mu\nu}_{el} = \frac{1}{4\pi M}\,\frac{1}{2}\sum_s \sum_{s'} \langle p; p, s|\hat{j}^\mu_{em}(0)|p; p', s'\rangle \langle p; p', s'|\hat{j}^\nu_{em}(0)|p; p, s\rangle \times (2\pi)^4\delta^4(p + q - p')\,\frac{1}{(2\pi)^3}\,\frac{d^3p'}{2E'}$.


Now use equation (8.208) with F₁ = 1 and κ = 0 (i.e. the electromagnetic
current matrix element for a ‘point’ proton) to show that the resulting cross
section is identical to that for elastic eμ scattering.


9.2


(a) Perform the contraction L_{μν}W^{μν} for inclusive inelastic electron-proton
scattering (remember q^μ L_{μν} = q^ν L_{μν} = 0). Hence verify
that the inclusive differential cross section in terms of ‘laboratory’
variables, and neglecting the electron mass, has the form


$\frac{d^2\sigma}{d\Omega\,dk'} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)]$.


(b) By calculating the Jacobian


$J = \begin{vmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{vmatrix}$

for a change of variables (x, y) → (u, v)
$du\,dv = |J|\,dx\,dy$


find expressions for d²σ/dQ²dν and d²σ/dx dy, where Q² and ν
have their usual significance, and x is the scaling variable Q²/2Mν
and y = ν/k.


9.3 Consider the description of inelastic electron-proton scattering in terms
of virtual photon cross sections:


(a) In the ‘laboratory’ frame with


$p^\mu = (M, 0, 0, 0)$ and $q^\mu = (q^0, 0, 0, q^3)$
evaluate the transverse spin sum


$\frac{1}{2}\sum_{\lambda=\pm 1}\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu}$.


Hence show that the ‘Hand’ cross section for transverse virtual photons is
$\sigma_T = (4\pi^2\alpha/K)\,W_1$.


(b) Using the definition


$\epsilon_S^\mu = (1/\sqrt{Q^2})(q^3, 0, 0, q^0)$


and rewriting this in terms of the ‘laboratory’ 4-vectors p^μ and q^μ,
evaluate the longitudinal/scalar virtual photon cross section. Hence
show that

$W_2 = \frac{K}{4\pi^2\alpha}\,\frac{Q^2}{Q^2 + \nu^2}\,(\sigma_S + \sigma_T)$.


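9.4 In this problem, we consider the representation of the 4 × 4 Dirac matrices
in which (see (3.40))


$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}$, $\beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.


Define also the 4 × 4 matrix $\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and the Dirac four-component
spinor $u = \begin{pmatrix} \phi \\ \chi \end{pmatrix}$. Then the two-component spinors φ, χ satisfy


$\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi = E\phi - m\chi$
$\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi = -E\chi + m\phi$.


(a) Show that for a massless Dirac particle, φ and χ become helicity
eigenstates (see section 3.3) with positive and negative helicity
respectively.

(b) Defining

$P_R = \frac{1 + \gamma_5}{2}$, $P_L = \frac{1 - \gamma_5}{2}$

show that P_R² = P_R, P_L² = P_L, P_R P_L = 0 = P_L P_R, and that P_R + P_L = 1.

Show also that


$P_R \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} \phi \\ 0 \end{pmatrix}$, $P_L \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix}$


and hence that P_R and P_L are projection operators for massless
Dirac particles, onto states of definite helicity. Discuss what happens
when m ≠ 0.


(c) The general massless spinor u can be written

$u = (P_L + P_R)u \equiv u_L + u_R$
where u_L, u_R have the indicated helicities. Show that


$\bar{u}\gamma^\mu u = \bar{u}_L\gamma^\mu u_L + \bar{u}_R\gamma^\mu u_R$
where $\bar{u}_L = u_L^\dagger\gamma^0$, $\bar{u}_R = u_R^\dagger\gamma^0$; and deduce that in electromagnetic
interactions of massless fermions helicity is conserved.


(d) In weak interactions an axial vector current $\bar{u}\gamma^\mu\gamma_5 u$ also enters. Is
helicity still conserved?


(e) Show that the ‘Dirac’ mass term $m\bar{\psi}\psi$ may be written as
$m(\bar{\psi}_L\psi_R + \bar{\psi}_R\psi_L)$.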


9.5 In the HERA colliding beam machine, positrons of total energy 27.5 GeV 
collide head on with protons of total energy 820 GeV. Neglecting both the 
positron and the proton rest masses, calculate the centre-of-mass energy in 
such a collision process. 


Some theories have predicted the existence of ‘leptoquarks’, which could 
be produced at HERA as a resonance state formed from the incident positron 
and the struck quark. How would a distribution of such events look, if plotted 
versus the variable x? 


9.6 


(a) By the expedient of inserting a $\delta$-function, the differential cross
section for Drell-Yan production of a lepton pair of mass $\sqrt{q^2}$ may
be written as

$$\frac{d\sigma}{dq^2} = \int dx_1\,dx_2\,\frac{d^2\sigma}{dx_1\,dx_2}\,\delta(q^2 - s x_1 x_2).$$

Show that this is equivalent to the form

$$\frac{d\sigma}{dq^2} = \frac{4\pi\alpha^2}{9q^4}\int dx_1\,dx_2\,x_1 x_2\,\delta(x_1 x_2 - \tau)
\times \sum_a e_a^2\left[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)\right]$$

which, since $q^2 = s\tau$, exhibits a scaling law of the form

$$s^2\,d\sigma/dq^2 = F(\tau).$$


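(b) Introduce the Feynman scaling variable

$$x_F = x_1 - x_2$$

with

$$q^2 = s x_1 x_2$$

and show that

$$dq^2\,dx_F = (x_1 + x_2)\,s\,dx_1\,dx_2.$$

Hence show that the Drell-Yan formula can be rewritten as

$$\frac{d^2\sigma}{dq^2\,dx_F} = \frac{4\pi\alpha^2}{9q^4}\,\frac{\tau}{(x_F^2 + 4\tau)^{1/2}}\sum_a e_a^2\left[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)\right].$$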


9.7 Verify that if the quarks participating in the Drell-Yan subprocess $q\bar{q} \to
\gamma \to \mu\bar{\mu}$ had spin-0, the CM angular distribution of the final $\mu^+\mu^-$ pair would
be proportional to $(1 - \cos^2\theta)$.


Part IV 


Loops and Renormalization 



10 


Loops and Renormalization I: The ABC 
Theory 


We have seen how Feynman diagrams represent terms in a perturbation theory 
expansion of physical amplitudes, namely the Dyson expansion of section 6.2. 
Terms of a given order all involve the same power of a ‘coupling constant’, 
which is the multiplicative constant appearing in the interaction Hamiltonian 
(for example, ‘g’ in the ABC theory, or the charge ‘e’ in electrodynamics). In
practice, it often turns out that the relevant parameter is actually the square
of the coupling constant, and factors of $4\pi$ have a habit of appearing on a
regular basis; so, for QED, the perturbation series is conveniently ordered
according to powers of the fine structure constant $\alpha = e^2/4\pi \approx 1/137$.


Equivalently, this is an expansion in terms of the number of vertices ap- 
pearing in the diagrams, since one power of the coupling constant is associated 
with each vertex. For a given physical process, the lowest-order diagrams (the 
ones with the fewest vertices) are those in which each vertex is connected 
to every other vertex by just one internal line; these are called tree diagrams. 
The Yukawa (u-channel) exchange process of figure 6.4, and the s-channel pro- 
cess of figure 6.5, are both examples of tree diagrams, and indeed all of our 
calculations so far have not gone further than this lowest-order (‘tree’) level. 
Admittedly, since $\alpha$ is after all pretty small, tree diagrams in QED are likely
to give us a good approximation to compare with experiment. Nevertheless, a 
long history of beautiful and ingenious experiments has resulted in observables 
in QED being determined to an accuracy far better than the O(1%) repre- 
sented by the leading (tree) terms. More generally, precision experiments at 
LEP and other laboratories have an accuracy sensitive to higher-order cor- 
rections in the Standard Model. Hence, some understanding of the physics 
beyond the tree approximation is now essential for phenomenology. 


All higher-order processes beyond the tree approximation involve loops, a 
concept easier to recognize visually than to define in words. In section 6.3.5 
we already met (figure 6.8) one example of an $O(g^4)$ correction to the $O(g^2)$
C-exchange tree diagram of figure 6.4, which contains one loop. The crucial 
point is that whereas a tree diagram can be cut into two separate pieces by 
severing just one internal line, to cut a loop diagram into two separate pieces 
requires the severing of at least two internal lines. 
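
The distinction between trees and loops can also be stated by counting. The sketch below is an illustration, not part of the original argument: it assumes the standard graph-theory relation L = I - V + 1 for a connected diagram with I internal lines and V vertices, and the vertex labelling of the two diagrams is a hypothetical encoding chosen here. It also checks that cutting the single internal line of the tree separates it, while cutting only one of the two lines forming the loop does not.

```python
from collections import defaultdict


def count_loops(num_vertices, internal_lines):
    """Independent loops of a connected diagram: L = I - V + 1 (assumed relation)."""
    return len(internal_lines) - num_vertices + 1


def splits_when_cut(num_vertices, internal_lines, cut_index):
    """Return True if removing one internal line disconnects the vertices."""
    remaining = [e for i, e in enumerate(internal_lines) if i != cut_index]
    adjacency = defaultdict(set)
    for a, b in remaining:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) < num_vertices


# Hypothetical encodings: vertices are integers, internal lines are vertex pairs.
# Tree (single C exchange): two vertices joined by one C line.
TREE = (2, [(0, 1)])
# One-loop diagram: C line, then an A line and a B line forming the loop, then C line.
BUBBLE = (4, [(0, 1), (1, 2), (1, 2), (2, 3)])

if __name__ == "__main__":
    print("tree loops:", count_loops(*TREE))
    print("bubble loops:", count_loops(*BUBBLE))
    print("tree splits when its line is cut:", splits_when_cut(*TREE, 0))
    print("bubble splits when one loop line is cut:", splits_when_cut(*BUBBLE, 1))
```
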


In these last two chapters of volume 1, we aim to provide an introduction
to higher-order processes, confining ourselves to ‘one-loop’ order. In the
present chapter we shall concentrate mainly on the particular loop appearing
in figure 6.8. This will lead us into the physics of renormalization for the ABC
theory, which, as a Yukawa-like theory, is a good theoretical laboratory for
studying ‘one-loop physics’, without the complications of spinor and gauge
fields. In the following chapter, we shall discuss one-loop diagrams in QED,
emphasizing some important physical consequences, such as corrections to
Coulomb’s law, anomalous magnetic moments and the running coupling constant.

FIGURE 10.1
$O(g^4)$ contribution to the process A + B → A + B, involving the modification
of the C propagator by the insertion of a loop.




10.1 The propagator correction in ABC theory 
10.1.1 The $O(g^2)$ self-energy $\Pi_C^{[2]}(q^2)$


We consider figure 6.8, reproduced here again as figure 10.1. In section 6.3.5, 
we gave the extra rule (‘(iii)’) needed to write down the invariant amplitude 
for this process. We first show how this rule arises in the special case of 
figure 10.1. 

Clearly, figure 10.1 is a fourth-order process, so it must emerge from the
term

$$\begin{aligned}
\frac{(-ig)^4}{4!}\iiiint d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; &\langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B) \\
&\times T\{\hat{\phi}_A(x_1)\hat{\phi}_B(x_1)\hat{\phi}_C(x_1)\ldots\hat{\phi}_A(x_4)\hat{\phi}_B(x_4)\hat{\phi}_C(x_4)\} \\
&\times \hat{a}^{\dagger}_A(p_A)\hat{a}^{\dagger}_B(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}
\end{aligned} \tag{10.1}$$

of the Dyson expansion. Since it is basically a u-channel exchange process
($u = (p_A - p'_B)^2 = (p'_A - p_B)^2$), the vev’s involving the external creation and
annihilation operators must appear as they do in equation (6.89) (‘ingoing
A, outgoing B′ at one point $x_2$; ingoing B, outgoing A′ at another point $x_1$’)
rather than as in equation (6.88) (‘ingoing A and B at $x_2$; outgoing A′ and
B′ at $x_1$’). In (10.1), however, we unfortunately have four space-time points
to choose from, rather than merely the two in (6.74). Figuring out exactly
which choices are in fact equivalent and which are not is best left to private
struggle, especially since we are not seriously interested in the numerical value
of our fourth-order corrections in this case. Let us simply consider one choice,
analogous to (6.89). This yields the amplitude (cf (6.91))


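$$\begin{aligned}
(-ig)^4\iiiint & d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2} \\
&\times \langle 0|T\{\hat{\phi}_C(x_1)\hat{\phi}_C(x_2)\hat{\phi}_A(x_3)\hat{\phi}_B(x_3)\hat{\phi}_C(x_3)\hat{\phi}_A(x_4)\hat{\phi}_B(x_4)\hat{\phi}_C(x_4)\}|0\rangle
\end{aligned} \tag{10.2}$$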

and we have discarded the numerical factor 1/4!. Once again, there are many 
terms in the expansion of the vev of the eight operators in (10.2). But, with 


an eye on the structure of the Feynman amplitude at which we are aiming 
(figure 10.1), let us consider again just a single contribution 


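$$\begin{aligned}
(-ig)^4\iiiint & d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2} \\
&\times \langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_3))|0\rangle\,\langle 0|T(\hat{\phi}_C(x_2)\hat{\phi}_C(x_4))|0\rangle \\
&\times \langle 0|T(\hat{\phi}_A(x_3)\hat{\phi}_A(x_4))|0\rangle\,\langle 0|T(\hat{\phi}_B(x_3)\hat{\phi}_B(x_4))|0\rangle
\end{aligned} \tag{10.3}$$

which contains four propagators connected as in figure 10.2.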
As we saw in section 6.3.2, each of these propagators is a function only 
of the difference of the two space-time points involved. Introducing relative
coordinates $x = x_1 - x_3$, $y = x_2 - x_4$, $z = x_3 - x_4$ and the CM coordinate
$X = \frac{1}{4}(x_1 + x_2 + x_3 + x_4)$, we find (problem 10.1) that (10.3) becomes

$$\begin{aligned}
(-ig)^4\iiiint & d^4X\,d^4x\,d^4y\,d^4z\; e^{i(p'_A + p'_B - p_A - p_B)\cdot X}\,e^{i(p'_A - p_B)\cdot(3x - y + 2z)/4} \\
&\times e^{i(p'_B - p_A)\cdot(-x + 3y - 2z)/4}\,D_C(x)\,D_C(y)\,D_A(z)\,D_B(z)
\end{aligned} \tag{10.4}$$

where $D_i$ is the position-space propagator for type-i particles (i = A, B, C),
defined as in (6.98). The integral over $X$ gives the expected overall 4-momentum
conservation factor, $(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)$. Setting $q = p_A - p'_B = p'_A - p_B$
(where 4-momentum conservation has been used), (10.4) becomes


$$\begin{aligned}
(-ig)^4(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\iiint & d^4x\,d^4y\,d^4z\; e^{iq\cdot x}D_C(x) \\
&\times e^{-iq\cdot y}D_C(y)\,e^{iq\cdot z}D_A(z)D_B(z).
\end{aligned} \tag{10.5}$$

FIGURE 10.2
The space-time structure of the integrand in (10.3).

The integrals over $x$ and $y$ separate out completely, each being just the
Fourier transform of a C propagator, that is, the momentum-space propagator
$\tilde{D}_C(q)$. Since the latter is a function of $q^2$ only, we end up with two
factors of $i/(q^2 - m_C^2 + i\epsilon)$, corresponding to the two C propagators in the
momentum-space Feynman diagram of figure 10.1. Note that the Mandelstam
u-variable is defined by $u = (p_A - p'_B)^2$ and is thus equal to $q^2$; we shall,
however, continue to use $q^2$ rather than $u$ in what follows.

The remaining factor represents the loop. Including $(-ig)^2$ for the two
vertices in the loop, it is given by

$$(-ig)^2\int d^4z\; e^{iq\cdot z}D_A(z)D_B(z) \tag{10.6}$$

which is the main result of our calculation so far. Since we want to end
up finally with a momentum-space amplitude, let us introduce the A and B
propagators in momentum space, and write (10.6) as (cf (6.99))

$$\begin{aligned}
(-ig)^2 & \int d^4z\; e^{iq\cdot z}\int\frac{d^4k_1}{(2\pi)^4}\,e^{-ik_1\cdot z}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\int\frac{d^4k_2}{(2\pi)^4}\,e^{-ik_2\cdot z}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon} \\
&= (-ig)^2\iint\frac{d^4k_1}{(2\pi)^4}\frac{d^4k_2}{(2\pi)^4}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon}\,(2\pi)^4\delta^4(k_1 + k_2 - q) \\
&= (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q - k)^2 - m_B^2 + i\epsilon}
\end{aligned} \tag{10.7}$$

$$\equiv -i\Pi_C^{[2]}(q^2), \tag{10.8}$$

where we have defined the function $-i\Pi_C^{[2]}(q^2)$ as the loop (or ‘bubble’)
amplitude appearing in figure 10.1. It is a function of $q^2$, as follows from Lorentz
invariance. The [2] refers to the two powers of $g$, as will be explained shortly,
after (10.15).

Careful consideration of the equivalences among the various contractions 
shows that the amplitude corresponding to figure 10.1 is, in fact, just the 
simple expression 
$$(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon}\,\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2 + i\epsilon} \tag{10.9}$$

where $\Pi_C^{[2]}(q^2)$ is given in (10.8). We see that whereas the ‘single-particle’
pieces, involving one C-exchange, do not involve any integral in momentum- 
space, the loop (which involves both A and B particles) does involve a momen- 
tum integral. This can be simply understood in terms of 4-momentum conservation,
which holds at every vertex of a Feynman graph. At the top (or bottom)
vertex of figure 10.1, the 4-momentum $q$ of the C-particle is fully determined
by that of the incoming and outgoing particles ($q = p_A - p'_B = p'_A - p_B$).
This same 4-momentum q flows in (and out) of the loop in figure 10.1, but 
nothing determines how it is to be shared between the A- and B-particles; 
all that can be said is that if the 4-momentum of A is k (as in (10.7)) then 
that of B is q — k, so that their sum is q. The ‘free’ variable k then has to be 
integrated over, and this is the physical origin of rule (iii) of section 6.3.5. 

We have devoted some time to the steps leading to expression (10.7), not 
only in order to follow the emergence of rule (iii) mathematically, but so as to 
lend some plausibility to a very important statement: the Feynman rules for 
associating factors with vertices and propagators, which we learned for tree 
graphs in chapters 6 and 8, also work, with the addition of rule (iii), for all 
more complicated graphs as well! Having seen most of just one fairly short 
calculation of a higher-order amplitude, the reader may perhaps now begin to 
appreciate just how powerful is the precise correspondence between ‘diagrams 
and amplitudes’, given by the Feynman rules. 

Having arrived at the expression for our first one-loop graph, we must 
at once draw the reader’s attention to the bad news: the integral in (10.7) is 
divergent at large values of k. We shall postpone a more detailed mathematical 
analysis until section 10.3.1, but the divergence can be plausibly inferred just 
from a simple counting of powers: there are four powers of k in the numerator 
and four in the denominator, and the likelihood is that the integral diverges
as $\int^{\Lambda} k^3\,dk/k^4 \sim \ln\Lambda$, as $\Lambda \to \infty$. This is plainly a disaster: a quantity
which was supposed to be a small correction in perturbation theory is actually 
infinite! Such divergences, occurring as loop momenta go to infinity, are called 
‘ultraviolet divergences’, and they are ubiquitous in quantum field theory. 
Only after a long struggle with these infinities was it understood how to obtain 
physically sensible results from such perturbation expansions. Depending on 
the type of field theory involved, the infinities can often be ‘tamed’ through a
procedure known as renormalization, to which we shall provide an introduction
in this and the following chapter. 
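
The power-counting estimate can be illustrated numerically. The sketch below does not evaluate the actual integral (10.7); it assumes a one-dimensional stand-in, $\int_0^{\Lambda} k^3\,dk/(k^2 + m^2)^2$, chosen here only because it has the same large-$k$ behaviour ($k^3\,dk/k^4$) and is regulated at small $k$ by a hypothetical mass $m$. Its closed form is $\frac{1}{2}[\ln(1 + \Lambda^2/m^2) + (1 + \Lambda^2/m^2)^{-1} - 1]$, so each factor of 10 in the cut-off adds very nearly $\ln 10$.

```python
import math


def cutoff_integral_closed(cutoff, m=1.0):
    """Closed form of the integral of k^3/(k^2 + m^2)^2 from 0 to cutoff."""
    r = cutoff**2 / m**2
    return 0.5 * (math.log(1.0 + r) + 1.0 / (1.0 + r) - 1.0)


def cutoff_integral_numeric(cutoff, m=1.0, steps=20000):
    """Midpoint rule in t = ln(1 + k/m), which samples the slow log growth evenly."""
    t_max = math.log(1.0 + cutoff / m)
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        k = m * (math.exp(t) - 1.0)
        jacobian = m * math.exp(t)  # dk/dt
        total += k**3 / (k**2 + m**2) ** 2 * jacobian * h
    return total


if __name__ == "__main__":
    for cutoff in (1e1, 1e2, 1e3, 1e4):
        print(cutoff, cutoff_integral_numeric(cutoff), cutoff_integral_closed(cutoff))
    # The increment per decade in the cut-off approaches ln(10).
    print(cutoff_integral_closed(1e4) - cutoff_integral_closed(1e3), math.log(10.0))
```
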

The physical ideas behind renormalization are, however, just as relevant 
in cases — such as condensed matter physics — where the analogous higher- 
order (loop) corrections are not infinite, though possibly large. In quantum 
mechanics, infinite momentum corresponds to zero distance, and our fields 
are certainly ‘point-like’. But in condensed matter physics there is generally a 
natural non-zero smallest distance — the lattice size, or an atomic diameter, for 
example. In quantum field theory, such a ‘shortest distance’ would correspond 
to a ‘highest momentum’, meaning that the magnitudes of loop momenta 
would run from zero up to some finite limit $\Lambda$, say, rather than infinity. Such
a $\Lambda$ is called a (momentum) ‘cut-off’. With such a cut-off in place, our loop
integrals are of course finite — but it would seem that we have then maltreated 
our field theory in some way. However, we might well ask whether we seriously 
believe that any of our quantum field theories is literally valid for arbitrarily 
high energies (or arbitrarily small distances). The answer is surely no: we are 
virtually certain that ‘new physics’ will come into play at some stage, which is 
not contained in — say — the QED, or even the Standard Model, Lagrangian. 
At what scale this new physics will enter (the Planck energy? 1 TeV?) we 
do not know, but surely the current models will break down at some point. 
We should not be too alarmed, therefore, by formal divergences as $\Lambda \to \infty$.
Rather, it may be sensible to regard a cut-off $\Lambda$ as standing for some ‘new
physics’ scale, accepting some such manoeuvre as physically realistic as well 
as mathematically prudent. 

At the same time, however, we would not want our physical predictions,
made using quantum field theories, to depend sensitively on $\Lambda$, i.e. on the
unknown short-distance physics, in this interpretation. Indeed, theories exist
(for example, those in the Standard Model and the ABC theory) which can be
reformulated in such a way that all dependence on $\Lambda$ disappears, as $\Lambda \to \infty$;
these are, precisely, renormalizable quantum field theories. Roughly speaking,
a renormalizable quantum field theory is one such that, when formulae are
expressed in terms of certain ‘physical’ parameters taken from experiment,
rather than in terms of the original parameters appearing in the Lagrangian,
calculated quantities will be finite and independent of $\Lambda$ as $\Lambda \to \infty$.

Solid state physics provides a close analogy. There, the usefulness of a 
description of, say, electrons in a metal in terms of their ‘effective charge’ and 
‘effective mass’, rather than their free-space values, is well established. In this 
analogy, the free-space quantities correspond to our Lagrangian values, while 
the effective parameters correspond to our ‘physical’ ones. In both cases, the 
interactions are causing changes to the parameters. 

It is clear that we need to understand more precisely just what our ‘physi- 
cal parameters’ might be and how they might be defined. This is what we aim 
to do in the remainder of the present section, and in the next one, before re- 
turning in section 10.3 to the mathematical details associated with evaluating 
(10.7), and indicating how renormalization works for the self-energy. Having
thus prepared the ground, we shall introduce a more powerful approach in
section 10.4, and offer a few preliminary remarks about ‘renormalizability’
in section 10.5, returning to that topic at the end of the following chapter.
Although usually not explicitly indicated, loop corrections considered in this
and the following section will be understood to be defined with a cut-off $\Lambda$,
so that they are finite.

FIGURE 10.3
$O(g^6)$ term in A + B → A + B, involving the insertion of two loops in the C
propagator.


To begin the discussion of the physical significance of our $O(g^4)$ correction,
(10.9), it is convenient to consider both the $O(g^2)$ term (6.100) and the $O(g^4)$
correction together, obtaining

$$(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)
\times\left\{\frac{i}{q^2 - m_C^2} + \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2}\right\} \tag{10.10}$$

where the $i\epsilon$ in the C propagators does not need to be retained. Both the
form of (10.10), and inspection of figure 10.1, suggest that the $O(g^4)$ term
we have calculated can be regarded as an $O(g^2)$ correction to the propagator
for the C-particle. Indeed, we can easily imagine adding in the $O(g^6)$ term
shown in figure 10.3, and in fact the whole infinite series of such ‘bubbles’
connected by simple C propagators. The infinite geometric series for the
corrected propagator shown in figure 10.4 has the form

$$\begin{aligned}
\frac{i}{q^2 - m_C^2} &+ \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2} \\
&+ \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2} + \cdots
\end{aligned} \tag{10.11}$$

$$= \frac{i}{q^2 - m_C^2}\,(1 + r + r^2 + \cdots) \tag{10.12}$$

where

$$r = \Pi_C^{[2]}(q^2)/(q^2 - m_C^2). \tag{10.13}$$

FIGURE 10.4
Series of one-loop (or ‘bubble’) insertions in the C propagator.

The geometric series in (10.12) may be summed, at least formally¹, to give
$(1 - r)^{-1}$ so that (10.12) becomes

$$\frac{i}{q^2 - m_C^2}\,\frac{1}{1 - \Pi_C^{[2]}(q^2)/(q^2 - m_C^2)} = \frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(q^2)}. \tag{10.14}$$

In this form it is particularly clear that we are dealing with corrections to the
simple C propagator $i/(q^2 - m_C^2)$. $\Pi_C^{[2]}$ is called the $O(g^2)$ self-energy.
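
The resummation in (10.11)-(10.14) is easy to check with numbers. The sketch below assumes arbitrary illustrative values for $q^2$, $m_C^2$ and a finite, constant $\Pi_C^{[2]}$ (none of them taken from the text), chosen so that $|r| < 1$; the function names are likewise illustrative. It compares partial sums of (10.12) with the closed form (10.14).

```python
def summed_propagator(q2, m2, self_energy):
    """Closed form i / (q^2 - m^2 - Pi), cf. (10.14)."""
    return 1j / (q2 - m2 - self_energy)


def series_propagator(q2, m2, self_energy, terms):
    """Partial sum of (10.12): i/(q^2 - m^2) * (1 + r + ... + r^(terms-1))."""
    r = self_energy / (q2 - m2)  # the ratio defined in (10.13)
    return 1j / (q2 - m2) * sum(r**n for n in range(terms))


if __name__ == "__main__":
    q2, m2, pi = 5.0, 1.0, 0.4  # illustrative numbers only; here r = 0.1
    for terms in (1, 2, 5, 40):
        print(terms, abs(series_propagator(q2, m2, pi, terms) - summed_propagator(q2, m2, pi)))
```

Keeping only the first term reproduces the free propagator, and each extra term corresponds to one more bubble insertion in figure 10.4.
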
Before proceeding with the analysis of (10.14), we note that it is a special
case of the more general expression

$$\tilde{D}'_C(q) = \frac{i}{q^2 - m_C^2 - \Pi_C(q^2)} \tag{10.15}$$

where $\tilde{D}'_C(q)$ is the complete (including all corrections) C propagator, and
$\Pi_C(q^2)$ is the sum of all ‘insertions’ in the C line, excluding those which
can be cut into two separate bits by severing a single line: $\Pi_C(q^2)$ is the
one-particle irreducible self-energy and we must exclude all one-particle
reducible bits from it as they are already included in the geometric series
summation (cf (10.11)). The amplitude $\Pi_C^{[2]}$ which we have calculated is simply
the lowest-order ($O(g^2)$) contribution to $\Pi_C(q^2)$; an $O(g^4)$ contribution to
$\Pi_C(q^2)$ is shown in figure 10.5.

¹Properly speaking this is valid only for $|r| < 1$, yet we know that $\Pi_C^{[2]}(q^2)$ actually
diverges! As we shall see, however, renormalization will be carried out after making such
quantities finite by ‘regularization’ (section 10.3.2), and then working systematically at a
given order in $g$ (section 10.4).

FIGURE 10.5
$O(g^4)$ contribution to $\Pi_C(q^2)$.


10.1.2 Mass shift 


We return to the expression (10.14) which includes the effect of all the iterated
$O(g^2)$ bubbles in the C propagator, where $\Pi_C^{[2]}(q^2)$ is given by

$$-i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q - k)^2 - m_B^2 + i\epsilon}. \tag{10.16}$$

Postponing the evaluation of (10.16) (and in particular the treatment of its
divergence) until section 10.3, we proceed to discuss the further implications
of (10.14).

First, suppose $\Pi_C^{[2]}$ were simply a constant, $\delta m_C^2$, say. In the absence of this
correction, we know (cf section 6.3.3) that the vanishing of the denominator
of the C propagator would correspond to the ‘mass-shell condition’ $q^2 = m_C^2$
appropriate to a free particle of momentum $\boldsymbol{q}$ and energy $q_0 = (\boldsymbol{q}^2 + m_C^2)^{1/2}$,
where $m_C$ is the mass of a C particle. It seems very plausible, therefore,
to interpret the constant $\delta m_C^2$ as a shift in the (mass)$^2$ of the C particle,
the denominator of (10.14) now vanishing at $q_0 = (\boldsymbol{q}^2 + m_C^2 + \delta m_C^2)^{1/2}$, if
$\Pi_C^{[2]} \approx \delta m_C^2$. The idea that the mass of a particle can be changed from its ‘free
space’ value by the presence of interactions with its ‘environment’ is a familiar
one in condensed matter physics, as noted above. In the case of electrons in
a metal, for example, it is not surprising that the presence of the lattice ions,
and the attendant band structure, affect the response of conduction electrons
to external fields, so that their apparent inertia changes. In the present case,
the ‘environment’ is, in fact, the vacuum. The process described by the bubble
$\Pi_C^{[2]}(q^2)$ is one in which a C particle dissociates virtually into an A-B pair,
which then recombine into the C particle, no other ‘external’ source being
present. As in earlier uses of the word, by ‘virtual’ here is meant a process in
which the participating particles leave their mass-shells. Thus, in particular,
in the expression (10.16) for $\Pi_C^{[2]}$, it will in general be the case that $k^2 \neq m_A^2$,
and $(q - k)^2 \neq m_B^2$.

In the case of the electron in a metal, both the ‘free’ and the ‘effective’
masses are measurable quantities. But we cannot get outside the vacuum!
This strongly suggests that what we must mean by ‘the physical (mass)$^2$’ of
a particle in our ABC theory is not the ‘free’ (Lagrangian) value $m_i^2$, which
is unmeasurable, but the effective (mass)$^2$ which includes all vacuum
interactions. This ‘physical (mass)$^2$’ may be defined to be that value of $q^2$ for
which

$$q^2 - m_i^2 - \Pi_i(q^2) = 0 \tag{10.17}$$

where $\Pi_i(q^2)$ is the complete one-particle irreducible self-energy for particle
type ‘i’. If we call the physical mass $m_{{\rm ph},i}$, then, we will have
$q^2 - m_i^2 - \Pi_i(q^2) = 0$ when $q^2 = m^2_{{\rm ph},i}$.
What we are dealing with in (10.14) is just the lowest-order contribution
to $\Pi_C(q^2)$, namely $\Pi_C^{[2]}(q^2)$, so that in our case $m^2_{{\rm ph},C}$ is determined by the
condition

$$q^2 - m_C^2 - \Pi_C^{[2]}(q^2) = 0 \quad\text{when}\quad q^2 = m^2_{{\rm ph},C} \tag{10.18}$$

which (to this order) is

$$m^2_{{\rm ph},C} = m_C^2 + \Pi_C^{[2]}(m^2_{{\rm ph},C}). \tag{10.19}$$

Once we have calculated $\Pi_C^{[2]}$ (see section 10.3), equation (10.19) could be
regarded as an equation to determine $m^2_{{\rm ph},C}$ in terms of the parameter $m_C^2$
which appeared in the original ABC Lagrangian. This might, indeed, be the
way such an equation would be viewed in condensed matter physics, where we
should know the values of the parameters in the Lagrangian. But in the
field-theory case $m_C^2$ is unobservable, so that such an equation has no predictive
value. Instead, we may regard it as an equation determining (up to $O(g^2)$)
$m_C^2$ in terms of $m^2_{{\rm ph},C}$, thus enabling us to eliminate (to this order in $g$) all
occurrences of the unobservable parameter $m_C^2$ from our amplitudes in favour
of the physical parameter $m^2_{{\rm ph},C}$. Note that $\Pi_C^{[2]}$ contains two powers of $g$, so
that in the spirit of systematic perturbation theory, the mass shift represented
by (10.19) is a second-order correction.

The crucial point here is that $\Pi_C^{[2]}$ depends on the cut-off $\Lambda$, whereas the
physical mass $m_{{\rm ph},C}$ clearly does not. But there is nothing to stop us supposing
that the unknown and unobservable Lagrangian parameter $m_C^2$ depends
on $\Lambda$ in just such a way as to cancel the $\Lambda$-dependence of $\Pi_C^{[2]}$, leaving $m_{{\rm ph},C}$
independent of $\Lambda$. This is the beginning of the ‘renormalization procedure’ in
quantum field theory.
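
The logic of (10.19) can be sketched with a toy model. The self-energy below is a hypothetical stand-in, not the ABC-theory result (which is only evaluated in section 10.3): it is assumed to grow like $\ln\Lambda^2$ and to depend weakly on $q^2$, and the coupling value and function names are illustrative. Read one way, (10.19) fixes the physical (mass)$^2$ from the Lagrangian one; read the other way, it fixes a cut-off dependent Lagrangian parameter such that the physical (mass)$^2$ is the same for every cut-off.

```python
import math


def toy_self_energy(q2, cutoff, g2=0.1):
    """Hypothetical cut-off dependent self-energy; NOT the ABC-theory expression."""
    return g2 * (math.log(cutoff**2) - 0.05 * q2)


def physical_mass_squared(m2_lagrangian, cutoff, iterations=60):
    """Solve m_ph^2 = m^2 + Pi(m_ph^2), cf. (10.19), by fixed-point iteration."""
    m2_ph = m2_lagrangian
    for _ in range(iterations):
        m2_ph = m2_lagrangian + toy_self_energy(m2_ph, cutoff)
    return m2_ph


def lagrangian_mass_squared(m2_ph, cutoff):
    """Read (10.19) the other way: m^2 = m_ph^2 - Pi(m_ph^2)."""
    return m2_ph - toy_self_energy(m2_ph, cutoff)


if __name__ == "__main__":
    m2_ph_input = 2.0  # the 'measured' value, in arbitrary units
    for cutoff in (1e2, 1e4, 1e6):
        m2 = lagrangian_mass_squared(m2_ph_input, cutoff)
        print(cutoff, m2, physical_mass_squared(m2, cutoff))
```

The Lagrangian parameter printed in the second column changes with the cut-off, while the physical value in the third column does not.
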


10.1.3 Field strength renormalization 


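We now need to consider the more realistic case in which $\Pi_C^{[2]}(q^2)$ is not a
constant. Let us expand it about the point $q^2 = m^2_{{\rm ph},C}$, writing

$$\Pi_C^{[2]}(q^2) \approx \Pi_C^{[2]}(m^2_{{\rm ph},C}) + (q^2 - m^2_{{\rm ph},C})\left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}} + \cdots. \tag{10.20}$$

The corrected propagator (10.14) then becomes

$$\frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(m^2_{{\rm ph},C}) - (q^2 - m^2_{{\rm ph},C})\left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}} \tag{10.21}$$

$$= \frac{i}{(q^2 - m^2_{{\rm ph},C})\left[1 - \left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right] + O\left((q^2 - m^2_{{\rm ph},C})^2\right)}. \tag{10.22}$$

The expression (10.22) has indeed the expected form for a ‘physical C’
propagator, having the simple behaviour $\sim 1/(q^2 - m^2_{{\rm ph},C})$ for $q^2 \approx m^2_{{\rm ph},C}$. However,
the normalization of this (corrected) propagator is different from that of the
‘free’ one, $i/(q^2 - m_C^2)$, because of the extra factor

$$\left[1 - \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right]^{-1}.$$

To the order at which we are working ($O(g^2)$), it is consistent to replace this
expression by

$$1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}.$$

Let us see how this factor may be understood.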

Our $O(g^2)$ corrected propagator is an approximation to the exact propagator
which we may write as $\langle\Omega|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|\Omega\rangle$, in coordinate space, where
$|\Omega\rangle$ is the exact vacuum. The free propagator, however, is $\langle 0|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|0\rangle$
as calculated in section 6.3.2. Consider one term in the latter,
$\theta(t_1 - t_2)\langle 0|\hat{\phi}_C(x_1)\hat{\phi}_C(x_2)|0\rangle$, and insert a complete set of free-particle states
‘$1 = \sum_n |n\rangle\langle n|$’ between the two free fields, obtaining

$$\theta(t_1 - t_2)\sum_n \langle 0|\hat{\phi}_C(x_1)|n\rangle\langle n|\hat{\phi}_C(x_2)|0\rangle. \tag{10.23}$$

The only free particle state $|n\rangle$ having a non-zero matrix element of the free
field $\hat{\phi}_C$ to the vacuum is the 1-C state, for which $\langle 0|\hat{\phi}_C(x)|C, k\rangle = e^{-ik\cdot x}$ as
we learned in chapters 5 and 6. Thus (10.23) becomes (cf equation (6.92))

$$\theta(t_1 - t_2)\int\frac{d^3\boldsymbol{k}}{(2\pi)^3\,2\omega_k}\,e^{-i\omega_k(t_1 - t_2) + i\boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)} \tag{10.24}$$

which is exactly the first term of equation (6.92). Consider now carrying out a
similar manipulation for the corresponding term of the interacting propagator,
obtaining

$$\theta(t_1 - t_2)\sum_n \langle\Omega|\hat{\phi}_C(x_1)|n\rangle\langle n|\hat{\phi}_C(x_2)|\Omega\rangle \tag{10.25}$$

where the states $|n\rangle$ are now the exact eigenstates of the full Hamiltonian. The
crucial difference between (10.23) and (10.25) is that in (10.25), multi-particle
states can appear in the states $|n\rangle$. For example, the state $|A, B\rangle$ consisting
of an A particle and a B particle will enter, because the interaction couples
this state to the 1-C states created and destroyed in $\hat{\phi}_C$: indeed, just such an
A + B state is present in $\Pi_C^{[2]}$. This means that, whereas in the free case the
‘content’ of the state $\langle 0|\hat{\phi}_C(x)$ was fully exhausted by the 1-C state $|C, k\rangle$
(in the sense that all overlaps with other states $|n\rangle$ were zero), this is not so
in the interacting case. The ‘content’ of $\langle\Omega|\hat{\phi}_C(x)$ is not fully exhausted by
the state $|C, k\rangle$: rather, it has overlaps with many other states. Now the sum
total of all these overlaps (in the sense of ‘$\sum_n |n\rangle\langle n|$’) must be unity. Thus
it seems clear that the ‘strength’ of the single matrix element $\langle\Omega|\hat{\phi}_C(x)|C, k\rangle$
in the interacting case cannot be the same as the free case (where the single
state exhausted the completeness sum). However, we expect it to be true that
$\langle\Omega|\hat{\phi}_C(x)|C, k\rangle$ is still basically the wavefunction for the C-particle. Hence we
shall write

$$\langle\Omega|\hat{\phi}_C(x)|C, k\rangle = \sqrt{Z_C}\,e^{-ik\cdot x} \tag{10.26}$$

where $Z_C$ is a constant to take account of the change in normalization
(the renormalization, in fact) required by the altered ‘strength’ of the matrix
element.

If (10.26) is accepted, we can now imagine repeating the steps leading from
equation (6.92) to equation (6.98) but this time for $\langle\Omega|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|\Omega\rangle$,
retaining explicitly only the single-particle state $|C, k\rangle$ in (10.25), and using
the physical (mass)$^2$, $m^2_{{\rm ph},C}$. We should then arrive at a propagator in the
interacting case which has the form

$$\langle\Omega|T(\hat{\phi}_C(x_1)\hat{\phi}_C(x_2))|\Omega\rangle = \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\left\{\frac{iZ_C}{k^2 - m^2_{{\rm ph},C} + i\epsilon}
+ \text{multiparticle contributions}\right\}. \tag{10.27}$$

The single-particle contribution in (10.27), after undoing the Fourier transform,
has exactly the same form as the one we found in (10.22), if we identify
the field strength renormalization constant $Z_C$ with the proportionality factor
in (10.22), to this order:

$$Z_C \approx Z_C^{[2]} = 1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \tag{10.28}$$

This is how the change in normalization in (10.22) is to be interpreted.
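
The identification of $Z_C$ with the normalization of the pole term can be tested on a toy model. The self-energy below is a hypothetical smooth function of $q^2$ with an overall factor $g^2$, chosen only for illustration (it is not the ABC-theory result), and the helper names are assumptions of this sketch. The code locates the pole from (10.19), extracts the coefficient of $i/(q^2 - m^2_{{\rm ph},C})$ numerically, and compares it with $[1 - d\Pi/dq^2]^{-1}$ and with the $O(g^2)$ form (10.28).

```python
def toy_self_energy(q2, g2=0.1):
    """Hypothetical q^2-dependent self-energy of order g^2 (illustration only)."""
    return g2 * (1.0 - 0.3 * q2 + 0.02 * q2**2)


def toy_self_energy_slope(q2, g2=0.1):
    """Analytic derivative d(Pi)/d(q^2) of the toy function above."""
    return g2 * (-0.3 + 0.04 * q2)


def pole_position(m2, iterations=60):
    """Solve m_ph^2 = m^2 + Pi(m_ph^2), cf. (10.19), by fixed-point iteration."""
    m2_ph = m2
    for _ in range(iterations):
        m2_ph = m2 + toy_self_energy(m2_ph)
    return m2_ph


def pole_normalization(m2, m2_ph, eps=1e-6):
    """Coefficient of 1/(q^2 - m_ph^2) in 1/(q^2 - m^2 - Pi(q^2)), taken near the pole."""
    q2 = m2_ph + eps
    return eps / (q2 - m2 - toy_self_energy(q2))


if __name__ == "__main__":
    m2 = 1.0
    m2_ph = pole_position(m2)
    slope = toy_self_energy_slope(m2_ph)
    print("pole at", m2_ph)
    print("numerical normalization", pole_normalization(m2, m2_ph))
    print("1/(1 - slope)          ", 1.0 / (1.0 - slope))
    print("1 + slope, cf. (10.28) ", 1.0 + slope)
```

The last two numbers differ only at second order in the slope, i.e. at $O(g^4)$, which is why replacing one by the other is consistent at the order considered.
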

It may be helpful to sketch briefly an analogy between this ‘renormalization’
and a very similar one in ordinary quantum mechanical perturbation
theory. Suppose we have a Hamiltonian $H = H_0 + V$ and that the $|n\rangle$ are
a complete set of orthonormal states such that $H_0|n\rangle = E_n^{(0)}|n\rangle$. The exact
eigenstates $|\bar{n}\rangle$ satisfy

$$(H_0 + V)|\bar{n}\rangle = E_n|\bar{n}\rangle. \tag{10.29}$$

To obtain $|\bar{n}\rangle$ and $E_n$ in perturbation theory, we write

$$|\bar{n}\rangle = \sqrt{N_n}\,|n\rangle + \sum_{i \neq n} c_{in}|i\rangle \tag{10.30}$$

where, if $|\bar{n}\rangle$ is also normalized, we have

$$1 = N_n + \sum_{i \neq n}|c_{in}|^2. \tag{10.31}$$

$N_n$ cannot be unity, since non-zero amounts of the states $|i\rangle$ ($i \neq n$) have been
‘mixed in’ by the perturbation, just as the A + B state was introduced into
the summation ‘$\sum_n |n\rangle\langle n|$’, in addition to the 1-C state. Inserting (10.30)
into (10.29) and taking the bracket with $\langle j|$ yields

$$c_{jn} = -\frac{\langle j|V|\bar{n}\rangle}{E_j^{(0)} - E_n} \tag{10.32}$$

which is still an exact expression. The lowest non-trivial approximation to
$c_{jn}$ is to take $|\bar{n}\rangle \approx \sqrt{N_n}\,|n\rangle$ and $E_n \approx E_n^{(0)}$ in (10.32), giving

$$c_{jn} \approx -\sqrt{N_n}\,\frac{\langle j|V|n\rangle}{E_j^{(0)} - E_n^{(0)}} \equiv -\sqrt{N_n}\,\frac{V_{jn}}{E_j^{(0)} - E_n^{(0)}}. \tag{10.33}$$

Equation (10.31) then gives $N_n$ as

$$N_n \approx \left(1 + \sum_{j \neq n}\frac{|V_{jn}|^2}{(E_j^{(0)} - E_n^{(0)})^2}\right)^{-1} \approx 1 - \sum_{j \neq n}\frac{|V_{jn}|^2}{(E_j^{(0)} - E_n^{(0)})^2} \tag{10.34}$$

to second order in $V_{jn}$. The reader may ponder on the analogy between (10.34)
and (10.28).
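
A two-level system gives a quick check of (10.34). The sketch assumes a real symmetric Hamiltonian with unperturbed energies e1, e2 and a single off-diagonal matrix element v (all numerical values illustrative, and the function names are assumptions of this sketch). For this case the exact eigenvector connected to the first unperturbed state is $(\cos\theta, \sin\theta)$ with $\tan 2\theta = 2v/(e_1 - e_2)$, so the exact $N_n$ is $\cos^2\theta$, to be compared with $1 - v^2/(e_2 - e_1)^2$.

```python
import math


def exact_normalization(e1, e2, v):
    """Exact N_1 = |<1|exact state>|^2 for H = [[e1, v], [v, e2]] (real symmetric)."""
    # Small-angle branch: the eigenvector that reduces to |1> as v -> 0.
    theta = 0.5 * math.atan(2.0 * v / (e1 - e2))
    return math.cos(theta) ** 2


def perturbative_normalization(e1, e2, v):
    """Second-order result of (10.34) with a single admixed state."""
    return 1.0 - v**2 / (e2 - e1) ** 2


if __name__ == "__main__":
    e1, e2 = 1.0, 3.0  # illustrative unperturbed energies
    for v in (0.4, 0.2, 0.1, 0.05):
        exact = exact_normalization(e1, e2, v)
        approx = perturbative_normalization(e1, e2, v)
        print(v, exact, approx, abs(exact - approx))
```

The discrepancy falls roughly like $v^4$, as expected for an approximation that is correct to second order in $V_{jn}$.
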


10.2 The vertex correction 


At the same order ($g^4$) of perturbation theory, we should also include, for
consistency, the processes shown in figures 10.6(a) and (b). Figure 10.6(a),
for example, has the general form

$$-ig\,\frac{i}{q^2 - m_C^2}\,\left(-igG^{[2]}(p_A, p'_B)\right) \tag{10.35}$$

where $-igG^{[2]}$ is the ‘triangle’ loop, given by an expression similar to (10.16)
but with a factor $(-ig)^3$ and three propagators. The ‘vertex correction’ $G^{[2]}$
depends on just two of its external 4-momenta because the third is determined
by 4-momentum conservation, as usual. Thus, the addition of figure 10.6(a)
and the $O(g^2)$ C-exchange tree diagram gives

$$-ig\,\frac{i}{q^2 - m_C^2}\,\left\{-ig + \left(-igG^{[2]}(p_A, p'_B)\right)\right\} \tag{10.36}$$

from which it seems plausible that $G^{[2]}$ will contribute, among other effects,
to a change in $g$. This change will be of order $g^2$, since we may write the
{...} bracket in (10.36) as

$$-ig\{1 + G^{[2]}(p_A, p'_B)\} \tag{10.37}$$

where $G^{[2]}$ is dimensionless and contains a $g^2$ factor, hence the superscript
[2].

FIGURE 10.6
$O(g^4)$ contributions to A + B → A + B, involving corrections to the ABC
vertices in figure 6.4.

Once again, the effect of interactions with the environment (i.e. vacuum 
fluctuations) has been to alter the value of a Lagrangian parameter away from 
the ‘free’ value. In the case of g the change is analogous to that in which an 
electron in a metal acquires an ‘effective charge’. How we define the ‘physical 
g’ is less clear than in the case of the physical mass and we shall not pursue 
this point here, since we shall discuss it again in the more interesting case of 
the charge ‘e’ in QED, in the following chapter. At all events, some suitable 
definition of ‘$g_{\rm ph}$’ can be given, so that it can be related to $g$ after the relevant
amplitudes have been computed. 

Let us briefly recapitulate progress. We are studying higher-order (one- 
loop) corrections to tree graph amplitudes in the ABC model, which has the 
Lagrangian density: 


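$$\hat{\mathcal{L}} = \sum_{i=A,B,C}\left\{\tfrac{1}{2}\partial_{\mu}\hat{\phi}_i\partial^{\mu}\hat{\phi}_i - \tfrac{1}{2}m_i^2\hat{\phi}_i^2\right\} - g\hat{\phi}_A\hat{\phi}_B\hat{\phi}_C. \tag{10.38}$$

FIGURE 10.7
Elementary one-loop amplitudes: (a) self-energy; (b) vertex correction.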


We have found that the loops considered so far, namely those in figures 10.1 
and 10.5, have the following qualitative effects: 


(i) the position of the single-particle mass-shell condition becomes shifted
away from the ‘Lagrangian’ value $m_i^2$ to a ‘physical’ value $m_{\mathrm{ph},i}^2$,
given by the vanishing of an expression such as (10.17);

(ii) the vacuum-to-one-particle matrix elements of the fields $\hat\phi_i$ have to
be ‘renormalized’ by a factor $\sqrt{Z_i}$, given by (10.28) to $O(g^2)$ for
$i = C$, and these factors have to be included in S-matrix elements;

(iii) the propagators contain some contribution from two-particle states
(e.g. ‘A + B’ for the C propagator);

(iv) the Lagrangian coupling $g$ is shifted by the interactions to a ‘physical’
value $g_{\mathrm{ph}}$.


Responsible for these effects were two ‘elementary’ loops, that for $-i\Pi^{[2]}$ shown
in figure 10.7(a) and that for $-igG^{[2]}$ shown in figure 10.7(b). It is noteworthy
that the effects (i), (ii) and (iv) all relate to changes (renormalizations, shifts)
in the fields and parameters of the original Lagrangian. We say, collectively,
that the ‘fields, masses and coupling have been renormalized’, i.e. generically
altered from their ‘free’ values by the virtual interactions represented
generically by figures 10.7(a) and (b). However, whereas in condensed matter
physics one might well have the ambition to calculate such effects from first
principles, in the field-theory case that makes no sense. Rather, by rewriting
all calculated expressions (at a given order of perturbation theory) in terms
of ‘renormalized’ quantities, we aim to eliminate the ‘unknown physics scale’,
$\Lambda$, from the theory. Let us now see how this works in more mathematical
detail.


314 10. Loops and Renormalization I: The ABC Theory 


10.3 Dealing with the bad news: a simple example 
10.3.1 Evaluating $\Pi_C^{[2]}(q^2)$


We turn our attention to the actual evaluation of a one-loop amplitude,
beginning with the simplest, which is $-i\Pi_C^{[2]}(q^2)$:

$$-i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_B^2 + i\epsilon}; \tag{10.39}$$

in particular, we want to know the precise mathematical form of the divergence
which arises when the momentum integral in (10.39) is not cut off at an upper
limit $\Lambda$. This will necessitate the introduction of a few modest tricks from a
large armoury (mostly due to Feynman) for dealing with such integrals.

The first move in evaluating (10.39) is to ‘combine the denominators’ using
the identity (problem 10.2)

$$\frac{1}{AB} = \int_0^1\frac{dx}{[(1-x)A + xB]^2} \tag{10.40}$$

(similar ‘Feynman identities’ exist for combining three or more denominator
factors). Applying (10.40) to (10.39) we obtain

$$-i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{[(1-x)(k^2 - m_A^2 + i\epsilon) + x((q-k)^2 - m_B^2 + i\epsilon)]^2}. \tag{10.41}$$

Collecting up terms inside the [...] bracket and changing the integration
variable to $k' = k - xq$ leads to (problem 10.3)

$$-i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k'}{(2\pi)^4}\,\frac{1}{(k'^2 - \Delta + i\epsilon)^2} \tag{10.42}$$

where

$$\Delta = -x(1-x)q^2 + x m_B^2 + (1-x)m_A^2. \tag{10.43}$$

The $d^4k'$ integral means $dk'^0\,d^3\boldsymbol{k}'$, and $k'^2 = (k'^0)^2 - \boldsymbol{k}'^2$.
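As a quick sanity check on the Feynman identity (10.40), the following is a minimal numerical sketch. It assumes real, positive sample values for $A$ and $B$ (in the loop integral they are complex, so this only illustrates the algebra), and the function name `feynman_parametrized` is introduced here purely for illustration.

```python
def feynman_parametrized(a, b, n=2000):
    """Right-hand side of (10.40), by composite Simpson's rule on x in [0, 1]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = 1.0 / n
    f = lambda x: 1.0 / ((1.0 - x) * a + x * b) ** 2
    total = f(0.0) + f(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3.0


a, b = 2.0, 5.0            # arbitrary positive sample values (an assumption)
lhs = 1.0 / (a * b)        # left-hand side of (10.40)
rhs = feynman_parametrized(a, b)
print(lhs, rhs)            # the two numbers should agree closely
```

The same agreement holds for any positive pair, since the $x$-integral can be done in closed form; that closed-form evaluation is the content of problem 10.2.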

We now perform the $k'^0$ integration in (10.42), for which we will need the
contour integration techniques explained in appendix F. The integral we want
to calculate is

$$\int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\partial}{\partial A}\int_{-\infty}^{\infty}\frac{dk'^0}{(k'^0)^2 - A} \equiv \frac{\partial}{\partial A}I(A) \tag{10.44}$$

where $A = \boldsymbol{k}'^2 + \Delta - i\epsilon$. We rewrite $I(A)$ as

$$I(A) = \lim_{R\to\infty}\int_{C_R}\frac{dz}{z^2 - A} \tag{10.45}$$

FIGURE 10.8
Location of the poles of (10.42) in the complex $k'^0$-plane.


where the contour $C_R$ is the real axis from $-R$ to $R$. Next, we identify the
points where the integrand $[z^2 - A]^{-1}$ ceases to be analytic (called ‘poles’),
which are at $z = \pm\sqrt{A} = \pm(\boldsymbol{k}'^2 + \Delta - i\epsilon)^{1/2}$. Figure 10.8 shows the location of
these points in the complex $z$ ($k'^0$)-plane: note that the ‘$i\epsilon$’ determines in which
half-plane each point lies (compare the similar role of the ‘$i\epsilon$’ in $(z + i\epsilon)^{-1}$, in the
proof in appendix F of the representation (6.93) for the $\theta$-function). We must
now ‘close the contour’ in order to be able to use Cauchy’s integral formula
of (F.19). We may do this by means of a large semicircle in either the upper
($C_+$) or lower ($C_-$) half-plane (again compare the discussion in appendix F).
The contribution from either such semicircle vanishes as $R \to \infty$, since on
either we have $z = Re^{i\theta}$, and

$$\int_{C_+\ \mathrm{or}\ C_-}\frac{dz}{z^2 - A} = \int\frac{Re^{i\theta}\,i\,d\theta}{R^2e^{2i\theta} - A} \to 0 \quad\text{as } R \to \infty. \tag{10.46}$$


For definiteness, let us choose to close the contour in the upper half-plane.
Then we are evaluating

$$I(A) = \oint_{C = C_R + C_+}\frac{dz}{(z - \sqrt{A})(z + \sqrt{A})} \tag{10.47}$$

around the closed contour $C$ shown in figure 10.9, which encloses the single
non-analytic point at $z = -\sqrt{A}$. Applying Cauchy’s integral formula (F.19)
with $a = -\sqrt{A}$ and $f(z) = (z - \sqrt{A})^{-1}$, we find

$$I(A) = 2\pi i\,\frac{1}{-2\sqrt{A}} \tag{10.48}$$

and thus

$$\int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\pi i}{2A^{3/2}}. \tag{10.49}$$

FIGURE 10.9
The closed contour $C$ used in the integral (10.47).


The reader may like to try taking the other choice ($C_-$) of closing contour,
and check that the answer is the same. Reinstating the remaining integrals in
(10.42) we have finally (as $\epsilon \to 0$)

$$-i\Pi_C^{[2]}(q^2) = \frac{ig^2}{8\pi^2}\int_0^1 dx\int_0^\infty\frac{u^2\,du}{(u^2 + \Delta)^{3/2}} \tag{10.50}$$

where $u = |\boldsymbol{k}'|$ and the integration over the angles of $\boldsymbol{k}'$ has yielded a factor
of $4\pi$. We see that the $u$-integral behaves as $\int du/u$ for large $u$, which is
logarithmically divergent, as expected from the start.
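The contour results (10.48) and (10.49) can be checked by brute-force integration along the real axis. The sketch below is only illustrative: it assumes a sample value of $A$ with a finite negative imaginary part (rather than the infinitesimal $-i\epsilon$) so that the integrand is smooth, and it uses the substitution $z = \tan t$, which is a choice made here for numerical convenience and not a step taken in the text. The name `line_integral` is hypothetical.

```python
import cmath
import math


def line_integral(A, power, n=4000):
    """Integrate dz / (z**2 - A)**power over the whole real axis.

    With z = tan(t) the real axis maps onto (-pi/2, pi/2) and the integrand
    becomes cos(t)**(2*(power-1)) / (sin(t)**2 - A*cos(t)**2)**power, a smooth
    periodic function, so a plain midpoint rule converges quickly.
    """
    h = math.pi / n
    total = 0j
    for j in range(n):
        t = -math.pi / 2 + (j + 0.5) * h
        s2, c2 = math.sin(t) ** 2, math.cos(t) ** 2
        total += c2 ** (power - 1) / (s2 - A * c2) ** power
    return total * h


A = complex(1.0, -0.5)                      # sample value with Im A < 0 (assumption)
root = cmath.sqrt(A)                        # principal root; lies in the lower half-plane
i_exact = 2j * math.pi / (-2 * root)        # right-hand side of (10.48)
di_exact = 1j * math.pi / (2 * root ** 3)   # right-hand side of (10.49)

print(line_integral(A, 1), i_exact)
print(line_integral(A, 2), di_exact)
```

Because the sign of the imaginary part of $A$ fixes which of the two poles lies in the upper half-plane, flipping it to a positive value would change the sign of the result; this is the numerical counterpart of the remark about the role of the ‘$i\epsilon$’.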


10.3.2 Regularization and renormalization 


Faced with results which are infinite, one can either try to go back to the 
very beginnings of the theory and see if a totally new start can avoid the 
infinities or one can see if they can somehow be ‘lived with’. The first approach 
may yet, ultimately, turn out to be correct: perhaps a future theory will be 
altogether free of divergences (such theories do in fact exist, but none as yet 
successfully describes the pattern of particles and forces we actually seem to 
have in Nature). For the moment, it is the second approach which has been 
pursued — indeed with great success as we shall see in the next chapter and 
in volume 2. 

Accepting the general framework of quantum field theory, then, the first 
thing we must obviously do is to modify the theory in some way so that 
integrals such as (10.50) do not actually diverge, so that we can at least discuss 
finite rather than infinite quantities. This step is called ‘regularization’ of the 
theory. There are many ways to do this but for our present purposes a simple 
one will do well enough, which is to cut off the $u$-integration in (10.50) at some
finite value $\Lambda$ (remember $u$ is $|\boldsymbol{k}'|$, so $\Lambda$ here will have dimensions of energy,
or mass); such a step was given some physical motivation in section 10.1.1.
Then we can evaluate the integral straightforwardly and move on to the next
stage.

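With the upper limit in (10.50) replaced by $\Lambda$, we can evaluate the
$u$-integral, obtaining (problem 10.4)

$$\Pi_C^{[2]}(q^2, \Lambda^2) = -\frac{g^2}{8\pi^2}\int_0^1 dx\left\{\ln\left(\frac{\Lambda + (\Lambda^2 + \Delta)^{1/2}}{\Delta^{1/2}}\right) - \frac{\Lambda}{(\Lambda^2 + \Delta)^{1/2}}\right\} \tag{10.51}$$

where from (10.43)

$$\Delta = -x(1-x)q^2 + x m_B^2 + (1-x)m_A^2. \tag{10.52}$$

Note that $\Delta > 0$ for $q^2 < 0$.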
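The closed form inside the curly bracket of (10.51) can be compared with a direct numerical evaluation of the cut-off $u$-integral of (10.50). This is a minimal sketch with arbitrary sample values of $\Lambda$ and $\Delta$ (both assumptions, in arbitrary units); the function names are introduced here for illustration only. The third function evaluates the large-$\Lambda$ form $\ln\Lambda + \ln 2 - 1 - \tfrac{1}{2}\ln\Delta$, to which the bracket tends when $\Delta/\Lambda^2$ is small.

```python
import math


def u_integral_closed(lam, delta):
    """Curly bracket of (10.51): the u-integral of (10.50) cut off at lam."""
    root = math.sqrt(lam * lam + delta)
    return math.log((lam + root) / math.sqrt(delta)) - lam / root


def u_integral_numeric(lam, delta, n=20000):
    """Composite Simpson estimate of int_0^lam u^2 du / (u^2 + delta)^(3/2)."""
    h = lam / n
    f = lambda u: u * u / (u * u + delta) ** 1.5
    total = f(0.0) + f(lam)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3.0


def bracket_large_cutoff(lam, delta):
    """Leading large-cutoff form: ln(lam) + ln 2 - 1 - (1/2) ln(delta)."""
    return math.log(lam) + math.log(2.0) - 1.0 - 0.5 * math.log(delta)


print(u_integral_closed(50.0, 1.3), u_integral_numeric(50.0, 1.3))
print(u_integral_closed(1.0e4, 1.3), bracket_large_cutoff(1.0e4, 1.3))
```

The second comparison makes the logarithmic growth with the cut-off explicit: doubling $\Lambda$ simply adds $\ln 2$ to the bracket, whatever the value of $\Delta$.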

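Inspection of (10.51) shows that as $\Lambda \to \infty$, $\Pi_C^{[2]}(q^2, \Lambda^2)$ contains a
divergent part proportional to $\ln\Lambda$. It is useful to isolate this divergent part, as
follows. For large $\Lambda$, we can expand the terms in (10.51) in powers of $\Delta/\Lambda^2$,
writing

$$\Lambda + (\Lambda^2 + \Delta)^{1/2} \approx 2\Lambda\left(1 + \frac{\Delta}{4\Lambda^2}\right) \tag{10.53}$$

and

$$\frac{\Lambda}{(\Lambda^2 + \Delta)^{1/2}} \approx 1 - \frac{\Delta}{2\Lambda^2}. \tag{10.54}$$

It follows that

$$\Pi_C^{[2]}(q^2, \Lambda^2) = -\frac{g^2}{8\pi^2}\int_0^1 dx\left\{\ln\Lambda + \ln 2 - 1 - \tfrac{1}{2}\ln\Delta\right\} \tag{10.55}$$

where terms that go to zero as $\Lambda \to \infty$ have been omitted.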
Relation (10.19) then becomes

$$m_C^2(\Lambda^2) = m_{\mathrm{ph},C}^2 - \Pi_C^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) \tag{10.56}$$

and there will be similar relations for the A and B masses. As noted previously,
after (10.19), the shift represented by (10.56) is an $O(g^2)$ perturbative
correction (because $\Pi_C^{[2]}$ contains a factor $g^2$), so that (again in the spirit
of systematic perturbation theory) it will be adequate to this order in $g^2$ to
replace the Lagrangian masses $m_A^2$, $m_B^2$ and $m_C^2$ inside the expressions for
$\Pi_A^{[2]}$, $\Pi_B^{[2]}$ and $\Pi_C^{[2]}$ by their physical counterparts. In this way the relations
(10.56) and the two similar ones give us the prescription for rewriting the $m_i^2$
in terms of the $m_{\mathrm{ph},i}^2$ and $\Lambda^2$. Of course, when this is done in the propagators,
the result is just to produce the desired form $\sim(q^2 - m_{\mathrm{ph},i}^2)^{-1}$, to this order.

So, for the propagator at this one-loop order, the effect of such mass shifts
is essentially trivial: the large $\Lambda$ behaviour is simply absorbed into $m_{\mathrm{ph},i}^2$. What
about $Z_C$? This was defined via (10.28) in terms of the quantity

$$\left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}. \tag{10.57}$$

However, equation (10.55) shows that the divergent part of $\Pi_C^{[2]}$ is independent
of $q^2$, or equivalently that the quantity (10.57) is finite. It follows that $Z_C$ is
finite in this theory. In other theories, quantities analogous to (10.55) might
contain a $q^2$-dependent divergence, which would be formally absorbed in the
rescaling represented by $Z_C$.

We may also analyse the vertex correction $G^{[2]}$ of figure 10.6, and conclude
that it too is finite, because there are now three propagators giving six powers
of $k$ in the denominator, with still only a four-dimensional $d^4k$ integration.
Once again, the analogous vertex correction in QED is divergent, as we shall 
see in chapter 11; there too this divergence can be absorbed into a redefinition 
of the physical charge. The ABC theory is, in fact, a ‘super-renormalizable’ 
one, meaning (loosely) that it has fewer divergences than might be expected. 
We shall come back to the classification of theories (renormalizable, non- 
renormalizable and super-renormalizable) at the end of the following chapter. 

While it is not our purpose to present a full discussion of one-loop renor- 
malization in the ABC theory (because it is not of any direct physical interest) 
we will use it to introduce one more important idea before turning, in the next 
chapter, to one-loop QED. 


10.4 Bare and renormalized perturbation theory 
10.4.1 Reorganizing perturbation theory 


We have seen that, of the one-loop effects listed at the end of section 10.2, the
mass shifts given by equations such as (10.14) do involve formal divergences
as $\Lambda \to \infty$, but the vertex correction and field strength renormalization are
finite in the ABC theory. We shall find that in QED the corresponding
quantities are all divergent, so that the perturbative replacement of all Lagrangian
parameters by their ‘physical’ counterparts, together with field strength
renormalizations, is mandatory in QED in order to get rid of $\ln\Lambda$ terms. However,
this process — of evaluating the connections between the two sets of param- 
eters, and then inserting them into all the calculated amplitudes — is likely 
to be very cumbersome. In this section, we shall introduce an alternative 
formulation, which has both calculational and conceptual advantages. 

By way of motivation, consider the QED analogue of the divergent part
of equation (10.7), which contributes a correction to the bare electron mass
of the form $\alpha m\ln(\Lambda/m)$ where $m$ is the electron mass. At $\Lambda = 100$ GeV the
magnitude of this is about 0.04 MeV (if we take $m$ to have the physical value),
which is a shift of some 10%. The application of perturbation theory would
seem more plausible if this kind of correction were to be included from the
start, so that the ‘free’ part of the Hamiltonian (or Lagrangian) involved the
physical fields and parameters, rather than the (unobserved) ones appearing
in the original theory. Then the main effects, in some sense, would already be
included by the use of these (empirical) physical quantities, and corrections
would be ‘more plausibly’ small. This is indeed the main reason for the
usefulness of such ‘effective’ parameters in the analogous case of condensed matter
physics. Actually, of course, in quantum field theory the corrections will be
just as infinite (if we send $\Lambda$ to infinity) in this approach also, since whichever
way we set the calculation up, we shall get loops, which are divergent. All the
same, this kind of ‘reorganization’ does offer a more systematic approach to
renormalization.
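The size of the estimate quoted above is easy to reproduce. The sketch below evaluates only the form $\alpha m\ln(\Lambda/m)$ stated in the text, with no additional numerical coefficient; the values used for $\alpha$ and the electron mass are standard approximate ones supplied here as assumptions, and the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.036   # fine-structure constant (approximate, assumed value)
M_E_MEV = 0.511         # physical electron mass in MeV (approximate, assumed value)


def log_mass_shift(cutoff_mev, m_mev=M_E_MEV, alpha=ALPHA):
    """Size of the alpha * m * ln(cutoff / m) term, in MeV."""
    return alpha * m_mev * math.log(cutoff_mev / m_mev)


shift = log_mass_shift(100.0e3)   # cut-off of 100 GeV, expressed in MeV
print(f"shift ~ {shift:.3f} MeV, i.e. {100 * shift / M_E_MEV:.0f}% of m")
```

Because the dependence on the cut-off is only logarithmic, raising $\Lambda$ by several orders of magnitude changes this estimate by a modest factor, which is why the correction stays at the level of a fraction of $m$ over a very wide range of cut-offs.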
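To illustrate the idea, consider again our ABC Lagrangian

$$\hat{\mathcal{L}} = \hat{\mathcal{L}}_{0,A} + \hat{\mathcal{L}}_{0,B} + \hat{\mathcal{L}}_{0,C} + \hat{\mathcal{L}}_{\mathrm{int}} \tag{10.58}$$

where

$$\hat{\mathcal{L}}_{0,C} = \tfrac{1}{2}\partial_\mu\hat\phi_C\,\partial^\mu\hat\phi_C - \tfrac{1}{2}m_C^2\hat\phi_C^2 \tag{10.59}$$

and similarly for $\hat{\mathcal{L}}_{0,A}$, $\hat{\mathcal{L}}_{0,B}$; and where

$$\hat{\mathcal{L}}_{\mathrm{int}} = -g\,\hat\phi_A\hat\phi_B\hat\phi_C. \tag{10.60}$$

There are two obvious moves to make: (i) introduce the rescaled
(renormalized) fields by

$$\hat\phi_{\mathrm{ph},i}(x) = Z_i^{-1/2}\hat\phi_i(x) \tag{10.61}$$

in order to get rid of the $\sqrt{Z_i}$ factors in the S-matrix elements; and (ii)
introduce the physical masses $m_{\mathrm{ph},i}^2$. Consider first the non-interacting parts
of $\hat{\mathcal{L}}$, namely

$$\hat{\mathcal{L}}_0 = \hat{\mathcal{L}}_{0,A} + \hat{\mathcal{L}}_{0,B} + \hat{\mathcal{L}}_{0,C}. \tag{10.62}$$

Singling out the C-parameters for definiteness, $\hat{\mathcal{L}}_0$ can then be written as

$$\hat{\mathcal{L}}_0 = \tfrac{1}{2}Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}m_C^2Z_C\,\hat\phi_{\mathrm{ph},C}^2 + \cdots$$

$$= \tfrac{1}{2}\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}m_{\mathrm{ph},C}^2\hat\phi_{\mathrm{ph},C}^2 + \tfrac{1}{2}(Z_C - 1)\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}(m_C^2Z_C - m_{\mathrm{ph},C}^2)\hat\phi_{\mathrm{ph},C}^2 + \cdots \tag{10.63}$$

$$= \hat{\mathcal{L}}_{0\mathrm{ph},C} + \left\{\tfrac{1}{2}\delta Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}(\delta Z_C\,m_{\mathrm{ph},C}^2 + \delta m_C^2\,Z_C)\hat\phi_{\mathrm{ph},C}^2\right\} + \cdots \tag{10.64}$$

where $\hat{\mathcal{L}}_{0\mathrm{ph},C}$ is the standard free-C Lagrangian in terms of the physical field
and mass, which leads to a Feynman propagator $i/(k^2 - m_{\mathrm{ph},C}^2 + i\epsilon)$ in the
usual way; also, $\delta Z_C = Z_C - 1$ and $\delta m_C^2 = m_C^2 - m_{\mathrm{ph},C}^2$. In (10.64) the dots
signify similar rearrangements of $\hat{\mathcal{L}}_{0,A}$ and $\hat{\mathcal{L}}_{0,B}$. Note that $Z_C$ and $m_C^2$ are
understood to depend on $\Lambda$, as usual, although this has not been indicated
explicitly.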
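The only algebra needed to pass from the mass term in (10.63) to the one in (10.64) is the substitution of the definitions $\delta Z_C = Z_C - 1$ and $\delta m_C^2 = m_C^2 - m_{\mathrm{ph},C}^2$. Written out as a short worked step (a sketch of the intermediate lines, which the text leaves implicit):

```latex
\begin{align*}
m_C^2 Z_C - m_{\mathrm{ph},C}^2
  &= \bigl(m_{\mathrm{ph},C}^2 + \delta m_C^2\bigr) Z_C - m_{\mathrm{ph},C}^2 \\
  &= m_{\mathrm{ph},C}^2\,(Z_C - 1) + \delta m_C^2\, Z_C \\
  &= \delta Z_C\, m_{\mathrm{ph},C}^2 + \delta m_C^2\, Z_C .
\end{align*}
```

The kinetic term needs no work at all, since its coefficient $\tfrac{1}{2}(Z_C - 1)$ is already $\tfrac{1}{2}\delta Z_C$.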

We now regard ‘$\hat{\mathcal{L}}_{0\mathrm{ph},A} + \hat{\mathcal{L}}_{0\mathrm{ph},B} + \hat{\mathcal{L}}_{0\mathrm{ph},C}$’ as the ‘unperturbed’ part of $\hat{\mathcal{L}}$,
and all the remainder of (10.64) as perturbations additional to the original $\hat{\mathcal{L}}_{\mathrm{int}}$
(much of theoretical physics consists of exploiting the identity ‘$a + b = (a + c) +
(b - c)$’). The effect of this rearrangement is to introduce new perturbations,
namely $\tfrac{1}{2}\delta Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C}$ and the $\hat\phi_{\mathrm{ph},C}^2$ term in (10.64), together with
similar terms for the A and B fields. Such additional perturbations are called
‘counter terms’ and they must be included in our new perturbation theory
based on the $\hat{\mathcal{L}}_{0\mathrm{ph},i}$ pieces. As usual, this is conveniently implemented in
terms of associated Feynman diagrams. Since both of these counter terms
involve just the square of the field, it should be clear that they only have
non-zero matrix elements between one-particle states, so that the associated
diagram has the form shown in figure 10.10, which includes both these
C-contributions. Problem 10.5 shows that the Feynman rule for figure 10.10
is that it contributes $i[\delta Z_C\,k^2 - (\delta Z_C\,m_{\mathrm{ph},C}^2 + \delta m_C^2\,Z_C)]$ to the $1\,C \to 1\,C$
amplitude.

FIGURE 10.10
Counter term corresponding to the terms in braces in (10.64).

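The original interaction term $\hat{\mathcal{L}}_{\mathrm{int}}$ may also be rewritten in terms of the
physical fields and a physical (renormalized) coupling constant $g_{\mathrm{ph}}$:

$$-g\,\hat\phi_A\hat\phi_B\hat\phi_C = -g(Z_AZ_BZ_C)^{1/2}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C}$$

$$= -g_{\mathrm{ph}}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C} - (Z_V - 1)\,g_{\mathrm{ph}}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C} \tag{10.65}$$

where

$$Z_V\,g_{\mathrm{ph}} = g(Z_AZ_BZ_C)^{1/2}. \tag{10.66}$$

The interpretation of (10.66) is clearly that ‘$g_{\mathrm{ph}}$’ is the coupling constant
describing the interactions among the $\hat\phi_{\mathrm{ph},i}$ fields, while the ‘$(Z_V - 1)$’ term
is another counter term, having the structure shown in figure 10.11.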

In summary, we have reorganized $\hat{\mathcal{L}}$ so as to base perturbation theory
on a part describing the free renormalized fields (rather than the fields in 
the original Lagrangian); in this formulation we find that, in addition to the 
(renormalized) ABC-interaction term, further terms have appeared which are 
interpreted as additional perturbations, called counter terms. These counter 
terms are determined, at each order in this (renormalized) perturbation the- 
ory, by what are basically self-consistency conditions — such as, for example, 
the requirement that the propagators really do reduce to the physical ones 
at the ‘mass-shell’ points. We shall now illustrate this procedure for the C 
propagator. 
FIGURE 10.11
Counter term corresponding to the ‘$(Z_V - 1)$’ term in (10.66).


10.4.2 The $O(g_{\mathrm{ph}}^2)$ renormalized self-energy revisited: how
counter terms are determined by renormalization conditions


Let us return to the calculation of the C propagator, following the same
procedure as in section 10.1, but this time ‘perturbing’ away from $\hat{\mathcal{L}}_{0\mathrm{ph},i}$ and
including the contribution from the counter term of figure 10.10, in addition
to the $O(g_{\mathrm{ph}}^2)$ self-energy. The expression (10.14) will now be replaced by

$$\frac{i}{q^2 - m_{\mathrm{ph},C}^2 + q^2\,\delta Z_C - \delta Z_C\,m_{\mathrm{ph},C}^2 - \delta m_C^2\,Z_C - \Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)} \tag{10.67}$$

where

$$-i\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) = (-ig_{\mathrm{ph}})^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_{\mathrm{ph},A}^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_{\mathrm{ph},B}^2 + i\epsilon} \tag{10.68}$$

and where we have indicated the cut-off dependence on the left-hand side,
leaving it understood on the right. Comparing (10.68) with (10.39) we see
that they are exactly the same, except that $\Pi_{\mathrm{ph},C}^{[2]}$ involves the ‘physical’
coupling constant $g_{\mathrm{ph}}$ and the physical masses, as expected in this renormalized
perturbation theory. In particular, $\Pi_{\mathrm{ph},C}^{[2]}$ will be divergent in exactly the same
way as $\Pi_C^{[2]}$, as the cut-off $\Lambda$ goes to infinity.

The essence of this ‘reorganized’ perturbation theory is that we now
determine $\delta Z_C$ and $\delta m_C^2$ from the condition that as $q^2 \to m_{\mathrm{ph},C}^2$, the propagator
(10.67) reduces to $i/(q^2 - m_{\mathrm{ph},C}^2)$, i.e. it correctly represents the physical C
propagator at the mass-shell point, with standard normalization. Expanding
$\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)$ about $q^2 = m_{\mathrm{ph},C}^2$ then, we reach the approximate form of (10.67),
valid for $q^2 \approx m_{\mathrm{ph},C}^2$:

$$\frac{i}{(q^2 - m_{\mathrm{ph},C}^2)Z_C - \delta m_C^2\,Z_C - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) - (q^2 - m_{\mathrm{ph},C}^2)\left.\dfrac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}}. \tag{10.69}$$

Requiring that this has the form $i/(q^2 - m_{\mathrm{ph},C}^2)$ gives

$$\text{condition (a)}\qquad \delta m_C^2 = -Z_C^{-1}\,\Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2)$$

$$\text{condition (b)}\qquad Z_C = 1 + \left.\frac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2} \tag{10.70}$$

Looking first at condition (b), we see that our renormalization constant $Z_C$
has, in this approach, been determined up to $O(g_{\mathrm{ph}}^2)$ by an equation that is, in
fact, very similar to (10.28), but it is expressed in terms of physical parameters.
As regards (a), since $Z_C = 1 + O(g_{\mathrm{ph}}^2)$, it is sufficient to replace it by 1 on
the right-hand side of (a), so that, to this order, $\delta m_C^2 \approx -\Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2)$.
Once again, this is similar to (10.56), but written in terms of the physical
quantities from the outset. We indicate that these evaluations of $Z_C$ and $\delta m_C^2$
are correct to second order by adding a superscript, as in $Z_C^{[2]}$.

Of course, we have not avoided the infinities (in the limit $\Lambda \to \infty$) in this
approach! It is still true that the loop integral in $\Pi_{\mathrm{ph},C}^{[2]}$ diverges
logarithmically and so the mass shift $(\delta m_C^2)^{[2]}$ is infinite as $\Lambda \to \infty$. Nevertheless, this
is a conceptually cleaner way to do the business. It is called ‘renormalized 
perturbation theory’, as opposed to our first approach which is called ‘bare 
perturbation theory’. What we there called the ‘Lagrangian fields and pa- 
rameters’ are usually called the ‘bare’ ones; the ‘renormalized’ quantities are 
‘clothed’ by the interactions. 

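We may now return to our propagator (10.67), and insert the results
(10.70) to obtain the final important expression for the C propagator
containing the one-loop $O(g_{\mathrm{ph}}^2)$ renormalized self-energy:

$$\frac{i}{q^2 - m_{\mathrm{ph},C}^2 - \bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)} \tag{10.71}$$

where

$$\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2) = \Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) - (q^2 - m_{\mathrm{ph},C}^2)\left.\frac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}. \tag{10.72}$$

We remind the reader that $\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)$ has exactly the same form as $\Pi_C^{[2]}(q^2, \Lambda^2)$
except that $g^2$ and $m_i^2$ are replaced by $g_{\mathrm{ph}}^2$ and $m_{\mathrm{ph},i}^2$. From (10.55) it then
follows that, as $\Lambda \to \infty$,

$$\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) = -\frac{g_{\mathrm{ph}}^2}{8\pi^2}(\ln\Lambda + \ln 2 - 1) + \frac{g_{\mathrm{ph}}^2}{16\pi^2}\int_0^1 dx\,\ln\Delta(x, q^2), \tag{10.73}$$

and hence

$$\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) = \frac{g_{\mathrm{ph}}^2}{16\pi^2}\int_0^1 dx\,\ln\left\{\frac{\Delta(x, q^2)}{\Delta(x, m_{\mathrm{ph},C}^2)}\right\} \tag{10.74}$$

which is finite as $\Lambda \to \infty$. It is also clear from (10.73) that $d\Pi_{\mathrm{ph},C}^{[2]}/dq^2$ is
finite as $\Lambda \to \infty$. Thus the quantity $\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)$ is finite as $\Lambda \to \infty$, and
is understood to be evaluated in that limit; the subtraction in (10.74) has
removed the infinity. The additional subtraction in (10.72) would in fact
have removed a logarithmic divergence in $Z_C$, had there been one. Note that
the form of (10.72) guarantees that the leading behaviour of $\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)$ near
$q^2 = m_{\mathrm{ph},C}^2$ is $(q^2 - m_{\mathrm{ph},C}^2)^2$, so that the behaviour of (10.71) near the
mass-shell point is indeed $i/(q^2 - m_{\mathrm{ph},C}^2)$ as desired.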
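The cancellation of the cut-off in (10.74) can be seen numerically. The sketch below evaluates the cut-off self-energy from (10.51) at two values of $q^2$ and compares their difference with the right-hand side of (10.74). All numerical values (unit masses, unit coupling, the chosen $q^2$) are illustrative assumptions, and the function names are introduced here only for this example.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def delta(x, q2, mA2=1.0, mB2=1.0):
    """Delta(x, q^2) of (10.43), with illustrative unit masses."""
    return -x * (1.0 - x) * q2 + x * mB2 + (1.0 - x) * mA2


def pi_cutoff(q2, lam, g2=1.0):
    """Cut-off self-energy, following (10.51)."""
    def bracket(x):
        d = delta(x, q2)
        root = math.sqrt(lam * lam + d)
        return math.log((lam + root) / math.sqrt(d)) - lam / root
    return -g2 / (8.0 * math.pi ** 2) * simpson(bracket, 0.0, 1.0)


def pi_subtracted(q2, m2, g2=1.0):
    """Right-hand side of (10.74): finite and cut-off independent."""
    ratio = lambda x: math.log(delta(x, q2) / delta(x, m2))
    return g2 / (16.0 * math.pi ** 2) * simpson(ratio, 0.0, 1.0)


q2, mC2 = -2.0, 1.0   # sample momentum transfer and physical C mass squared
for lam in (1.0e2, 1.0e3, 1.0e4):
    diff = pi_cutoff(q2, lam) - pi_cutoff(mC2, lam)
    print(lam, pi_cutoff(q2, lam), diff, pi_subtracted(q2, mC2))
```

In the printed table the second column keeps growing in magnitude with the cut-off, while the third column settles onto the cut-off-independent value in the fourth. The chosen masses keep $\Delta$ positive over the whole $x$ range, which the logarithms require.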

A succinct way of summarizing our final renormalized result (10.71), with
the definition (10.72), is to say that the C propagator may be defined by
(10.71) where the $O(g_{\mathrm{ph}}^2)$ renormalized self-energy $\bar\Pi_{\mathrm{ph},C}^{[2]}$ satisfies the
renormalization conditions

$$\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2 = m_{\mathrm{ph},C}^2) = 0 \quad\text{and}\quad \left.\frac{d}{dq^2}\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)\right|_{q^2 = m_{\mathrm{ph},C}^2} = 0. \tag{10.75}$$

Relations analogous to (10.75) clearly hold for the A and B self-energies also.
In this definition, the explicit introduction and cancellation of large-$\Lambda$ terms
has disappeared from sight, and all that remains is the importation of one
constant from experiment, $m_{\mathrm{ph},C}$, and a (hidden) rescaling of the fields. It is
useful to bear this viewpoint in mind when considering more general theories,
including ones that are ‘non-renormalizable’ (see section 11.8 of the following
chapter).

There is a lot of good physics in the expression (10.71), which we shall elu- 
cidate in the realistic case of QED in the next chapter. For the moment, we 
just whet the reader’s appetite by pointing out that (10.71) must amount to 
the prediction of a finite, calculable correction to the Yukawa 1-C exchange
potential, which after all is given by the Fourier transform of the (static form 
of) the propagator, as we learned long ago. In the case of QED, this will 
amount to a calculable correction to Coulomb’s law, due to radiative correc- 
tions, as we shall discuss in section 11.5.1. 

There is an important technical implication we may draw from (10.75).
Consider the Feynman diagram of figure 10.12 in which a propagator
correction has been inserted in an external line. This diagram is of order $g_{\mathrm{ph}}^4$ and
should presumably be included along with the others at this order. However,
the conditions (10.75), in this case written for $\bar\Pi_{\mathrm{ph},A}^{[2]}$, imply that it vanishes.
Omitting irrelevant factors, the amplitude for figure 10.12 is

$$\bar\Pi_{\mathrm{ph},A}^{[2]}(p_A^2)\,\frac{1}{p_A^2 - m_{\mathrm{ph},A}^2}\,\frac{1}{q^2 - m_{\mathrm{ph},C}^2} \tag{10.76}$$

and we need to take the limit $p_A^2 \to m_{\mathrm{ph},A}^2$ since the external A particle is
on-shell. Expanding $\bar\Pi_{\mathrm{ph},A}^{[2]}$ about the point $p_A^2 = m_{\mathrm{ph},A}^2$ and using conditions
(10.75) for $C \to A$ we see that (10.76) vanishes. Thus with this definition,
propagator corrections do not need to be applied to external lines.

FIGURE 10.12
$O(g^4)$ contribution to $A + B \to A + B$, involving a propagator correction
inserted in an external line.
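The vanishing of the external-line correction rests on the double zero of the twice-subtracted self-energy at the mass-shell point. The following minimal sketch illustrates this numerically, building $\bar\Pi$ from the finite forms (10.72) to (10.74). The masses and coupling are set to unity purely for illustration (an assumption), and `pi_bar` is a name introduced here.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def delta(x, p2, m1sq=1.0, m2sq=1.0):
    """Delta(x, p^2) as in (10.43), for the two particles running in the loop."""
    return -x * (1.0 - x) * p2 + x * m2sq + (1.0 - x) * m1sq


def pi_bar(p2, m2, g2=1.0):
    """Twice-subtracted self-energy, from (10.72) combined with (10.73)-(10.74)."""
    def integrand(x):
        d_p, d_m = delta(x, p2), delta(x, m2)
        # d(ln Delta)/dp^2 evaluated at p^2 = m^2 is -x(1 - x)/Delta(x, m^2)
        slope = -x * (1.0 - x) / d_m
        return math.log(d_p / d_m) - (p2 - m2) * slope
    return g2 / (16.0 * math.pi ** 2) * simpson(integrand, 0.0, 1.0)


m2 = 1.0
for eps in (1.0e-1, 1.0e-2, 1.0e-3):
    # Combination appearing in (10.76): self-energy divided by one propagator denominator
    print(eps, pi_bar(m2 + eps, m2) / eps)
```

The printed ratio falls roughly in proportion to the distance from the mass shell, which is what a leading behaviour $\propto (p^2 - m^2)^2$ implies, so the product in (10.76) goes to zero in the on-shell limit.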




10.5 Renormalizability 


We have seen how divergences present in self-energy loops like figure 10.7(a) 
can be eliminated by supposing that the ‘bare’ masses in the original La- 
grangian depend on the cut-off in just such a way as to cancel the divergences, 
leaving a finite value for the physical masses. The latter are, however, param- 
eters to be taken from experiment: they are not calculable. Alternatively, we 
may rephrase perturbation theory in terms of renormalized quantities from the 
outset, in which case the loop divergence is cancelled by appropriate counter 
terms; but again the physical masses have to be taken from experiment. We 
pointed out that, in the ABC theory, neither the field strength
renormalizations $Z_i$ nor the vertex diagrams of figure 10.5 were divergent, but we shall see
in the next chapter that the analogous quantities in QED are divergent. These
divergences too can be absorbed into redefinitions of the ‘physical’ fields and
a ‘physical’ coupling constant (the latter again to be taken from experiment).
Or, again, such divergences can be cancelled by appropriate counter terms in
the renormalized perturbation theory approach.

FIGURE 10.13
(a) $O(g^4)$ one-loop contribution to $A + B \to A + B$; (b) counter term that
would be required if (a) were divergent.

In general, a theory will have various divergences at the one-loop level, 
and new divergences will enter as we go up in order of perturbation theory (or 
number of loops). Typically, therefore, quantum field theories betray sensitiv- 
ity to unknown short-distance physics by the presence of formal divergences 
in loops, as a cut-off $\Lambda \to \infty$. In a renormalizable theory, this sensitivity can
be systematically removed by accepting that a finite number of parameters 
are uncalculable, and must be taken from experiment. These are the suitably 
defined ‘physical’ values of the masses and coupling constants appearing in 
the Lagrangian. Once these parameters are given, all other quantities are 
finite and calculable, to any desired order in perturbation theory — assuming, 
of course, that terms in successive orders diminish sensibly in size. 

Alternatively, we may say that a renormalizable theory is one in which a 
finite number of counter terms can be so chosen as to cancel all divergences 
order by order in renormalized perturbation theory. Note, now, that the only 
available counter terms are the ones which arise in the process of ‘reorganizing’ 
the original theory in terms of renormalized quantities plus extra bits (the 
counter terms). All the counter terms must correspond to masses, interactions, 
etc which are present in the original (or ‘bare’) Lagrangian — which is, in fact, 
the theory we are trying to make sense of! We are not allowed to add in any 
old kind of counter term — if we did, we would be redefining the theory. 

We can illustrate this point by considering, for example, a one-loop ($O(g^4)$)
contribution to $AB \to AB$ scattering, as shown in figure 10.13(a). If this graph
is divergent, we will need a counter term with the structure shown in
figure 10.13(b) to cancel the divergence, but there is no such ‘contact’ $AB \to AB$
interaction in the original theory (it would have the form $\lambda\hat\phi_A^2(x)\hat\phi_B^2(x)$). In
fact, the graph is convergent, as indicated by the usual power-counting (four
powers of $k$ in the numerator, eight in the denominator from the four
propagators). And indeed, the ABC theory is renormalizable or rather, as noted
earlier, ‘super-renormalizable’.

We shall have something more to say about renormalizability and non- 
renormalizability (is it fatal?), at the end of the following chapter. The first 
and main business, however, will be to apply what we have learned here to 
QED. 




Problems 


10.1 Carry out the indicated change of variables so as to obtain (10.4) from 
(10.3). 


10.2 Verify the Feynman identity (10.40). 
10.3 Obtain (10.42) from (10.41). 


10.4 Obtain (10.51) from (10.50), having replaced the upper limit of the
$u$-integral by $\Lambda$.


10.5 Obtain the Feynman rule quoted in the text for the sum of the counter 
terms appearing in (10.64). 


11 


Loops and Renormalization II: QED 


The present electrodynamics is certainly incomplete, but is no longer cer- 
tainly incorrect. 


—F. J. Dyson (1949b) 


We now turn to the analysis of loop corrections in QED. As we might expect, 
a theory with fermionic and gauge fields proves to be a tougher opponent than 
one with only spinless particles, even though we restrict ourselves to one-loop 
diagrams only. 

At the outset we must make one important disclaimer. In QED many 
loop diagrams diverge not only as the loop momentum goes to infinity (‘ul- 
traviolet divergence’) but also as it goes to zero (‘infrared divergence’). This 
phenomenon can only arise when there are massless particles in the theory — 
for otherwise the propagator factors $\sim(k^2 - M^2)^{-1}$ will always prevent any
infinity at low k. Of course, in a gauge theory we do have just such mass- 
less quanta. Our main purpose here is to demonstrate how the ultraviolet 
divergences can be tamed and we must refer the reader to Weinberg (1995, 
chapter 13), or to Peskin and Schroeder (1995, section 6.5), for instruction in 
dealing with the infrared problem. The remedy lies, essentially, in a careful 
consideration of the contribution, to physical cross sections, of amplitudes in- 
volving the real emission of very low frequency photons, along with infrared 
divergent virtual photon processes. It is a ‘technical’ problem, having to do 
with massless particles (of which there are not that many), whereas ultraviolet 
divergences are generic. 




11.1 Counter terms 


We shall consider the simplest case of a single fermion of bare mass $m_0$ and
bare charge $e_0$ ($e_0 > 0$) interacting with the Maxwell field, for which the bare
(i.e. actual!) Lagrangian is

$$\hat{\mathcal{L}} = \bar{\hat\psi}_0(i\slashed{\partial} - m_0)\hat\psi_0 - e_0\,\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\,\hat A_{0\mu} - \tfrac{1}{4}\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2 \tag{11.1}$$

FIGURE 11.1
Counter terms in QED: (a) electron mass and wavefunction; (b) photon
wavefunction; (c) vertex part.


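according to chapter 7. We shall adopt the ‘renormalized perturbation theory’
approach and begin by introducing field strength renormalizations via

$$\hat\psi = Z_2^{-1/2}\hat\psi_0 \tag{11.2}$$

$$\hat A^\mu = Z_3^{-1/2}\hat A_0^\mu \tag{11.3}$$

where the ‘physical’ fields and parameters will now simply have no ‘0’
subscript. This will lead to a rewriting of the free and gauge-fixing part of (11.1):

$$\bar{\hat\psi}_0(i\slashed{\partial} - m_0)\hat\psi_0 - \tfrac{1}{4}\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2$$

$$= \bar{\hat\psi}(i\slashed{\partial} - m)\hat\psi - \tfrac{1}{4}\hat F_{\mu\nu}\hat F^{\mu\nu} - \frac{1}{2\xi}(\partial\cdot\hat A)^2 + \bar{\hat\psi}[(Z_2 - 1)i\slashed{\partial} - \delta m]\hat\psi - \tfrac{1}{4}(Z_3 - 1)\hat F_{\mu\nu}\hat F^{\mu\nu} \tag{11.4}$$

where $\xi = \xi_0/Z_3$ and $\delta m = m_0Z_2 - m$ (compare (10.64)). We see the
emergence of the expected ‘$\bar{\hat\psi}\ldots\hat\psi$’ and ‘$\hat F\cdot\hat F$’ counter terms in (11.4), affecting
both the fermion and the gauge-field propagators. Next, we write the
interaction in terms of a physical $e$, and the physical fields, together with a
compensating third counter term:

$$-e_0\,\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\,\hat A_{0\mu} = -e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu - (Z_1 - 1)\,e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu \tag{11.5}$$

where, with the aid of (11.2) and (11.3),

$$Z_1e = e_0Z_2Z_3^{1/2}. \tag{11.6}$$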
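The way (11.2), (11.3) and (11.6) combine to give (11.5) can be written out in a few lines. This is a sketch of the intermediate steps, which the text leaves implicit; it uses only the rescalings $\hat\psi_0 = Z_2^{1/2}\hat\psi$ and $\hat A_0^\mu = Z_3^{1/2}\hat A^\mu$ that follow from (11.2) and (11.3).

```latex
\begin{align*}
e_0\,\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\,\hat A_{0\mu}
  &= e_0\,Z_2 Z_3^{1/2}\;\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu
  && \text{using (11.2) and (11.3)} \\
  &= Z_1 e\;\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu
  && \text{using (11.6)} \\
  &= e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu
   + (Z_1 - 1)\,e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu .
\end{align*}
```

Restoring the overall minus sign reproduces (11.5): the first term is the interaction with the physical charge, and the second is the vertex counter term.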


The three counter terms are represented diagrammatically as shown in
figures 11.1(a), (b) and (c), for which the Feynman rules are, respectively,

$$\begin{aligned}
\text{(a)}&:\quad i[\slashed{p}(Z_2 - 1) - \delta m] \\
\text{(b)}&:\quad -i(g^{\mu\nu}k^2 - k^\mu k^\nu)(Z_3 - 1) \\
\text{(c)}&:\quad -ie\gamma^\mu(Z_1 - 1).
\end{aligned} \tag{11.7}$$

FIGURE 11.2
Elementary one-loop divergent diagrams in QED.

These counter terms will compensate for the ultraviolet divergences of the
three elementary loop diagrams of figure 11.2, and in fact they are sufficient
to eliminate all such divergences in all QED loops.

Before proceeding further we remark that we already have a first indication
that renormalizing a gauge theory presents some new features. Consider the
two counter terms involving $Z_2 - 1$ and $Z_1 - 1$; their sum gives

$$\bar{\hat\psi}[(Z_2 - 1)i\slashed{\partial} - e(Z_1 - 1)\slashed{\hat A}]\hat\psi \tag{11.8}$$

which is not of the ‘gauge principle’ form ‘$i\slashed{\partial} - e\slashed{\hat A}$’! Unless, of course, $Z_1 =
Z_2$. This relation between the two quite different renormalization constants
is, in fact, true to all orders in perturbation theory, as a consequence of a
Ward identity (Ward 1950), which is itself a consequence of gauge invariance.
We shall discuss the Ward identity and $Z_1 = Z_2$ at the one loop level in
section 11.6.


11.2 The $O(e^2)$ fermion self-energy


In analogy with $-i\Pi_C^{[2]}$, the amplitude corresponding to figure 11.2(a) is the
fermion self-energy $-i\Sigma^{[2]}$ where

$$-i\Sigma^{[2]}(p) = (-ie)^2\int\gamma^\nu\,\frac{-ig_{\mu\nu}}{k^2}\,\frac{i}{\slashed{p} - \slashed{k} - m}\,\gamma^\mu\,\frac{d^4k}{(2\pi)^4} \tag{11.9}$$

and we have now chosen the gauge $\xi = 1$. As expected, the $d^4k$ integral
in (11.9) diverges for large $k$, this time more seriously than the integral in
$\Pi_C^{[2]}$, because there are only three powers of $k$ in the denominator of (11.9)
as opposed to four in (10.7). Once again, we need to choose some form of
regularization to make (11.9) ultraviolet finite. We shall not be specific (as
yet) about what choice we are making, since whatever it may be the outcome
will be qualitatively similar to the $\Pi_C^{[2]}$ case.

There is, however, one interesting new feature in this (fermion) case. As
previously indicated, power-counting in the integral of (11.9) might lead us to
expect that, if we adopt a simple cut-off, the leading ultraviolet divergence
of $\Sigma^{[2]}$ would be proportional to $\Lambda$ rather than $\ln\Lambda$. This is because we
have that one extra power of $k$ in the numerator and $\Sigma^{[2]}$ has dimensions
of mass. However, this is not so. The leading $p$-independent divergence is,
in fact, proportional to $m\ln(\Lambda/m)$. The reason for this is important and
it has interesting generalizations. Suppose that $m$ in (11.4) were set equal
to zero. Then, as we saw in problem 9.4, the two helicity components $\hat\psi_L$
and $\hat\psi_R$ of the electron field will not be coupled by the QED interaction.
It follows that no terms of the form $\bar{\hat\psi}_L\hat\psi_R$ or $\bar{\hat\psi}_R\hat\psi_L$ can be generated, and
hence no perturbatively induced mass term, if $m = 0$. The perturbative mass
shift must be proportional to $m$ and therefore, on dimensional grounds, only
logarithmically divergent.

There is also a $p$-dependent divergence of the self-energy, of which warning
was given in section 10.3.2. As in the scalar case, this will be associated with
the field strength renormalization factor $Z_2$. It is proportional to $\slashed{p}\ln(\Lambda/m)$
($Z_2$ is the coefficient of $i\slashed{\partial}$ in (11.8), which leads to $\slashed{p}$ in momentum space). The
upshot is that the fermion propagator, including the one-loop renormalized
self-energy, is given by

$$
\frac{i}{\slashed{p} - m - \bar\Sigma^{[2]}(p)}
\tag{11.10}
$$

where (cf (10.74))

$$
\bar\Sigma^{[2]}(p) = \Sigma^{[2]}(p) - \Sigma^{[2]}(\slashed{p} = m)
- (\slashed{p} - m)\left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m}.
\tag{11.11}
$$

Whatever form of regularization is used, the twice-subtracted $\bar\Sigma^{[2]}$ will be
finite and independent of the regulator when it is removed. In terms of the
‘compensating’ quantities $Z_2$ and $m_0 - m$, we find (problem 11.1, cf (10.70))

$$
Z_2 = 1 + \left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m},
\qquad
m_0 - m = -Z_2^{-1}\,\Sigma^{[2]}(\slashed{p} = m).
\tag{11.12}
$$


Note that, as in the case of $\bar\Pi_C^{[2]}$, the definition (11.11) of $\bar\Sigma^{[2]}$ implies that
propagator corrections vanish for external (on-shell) fermions. The quantities
$Z_2$ and $m_0$ determined by (11.12) now carry a superscript ‘[2]’ to indicate that
they are correct at $O(e^2)$.

We must now remind the reader that, although we have indeed eliminated
the ultraviolet divergences in $\Sigma^{[2]}$ by the subtractions of (11.11), there remains
an untreated infrared divergence in $d\Sigma^{[2]}/d\slashed{p}$. To show how this is dealt with
would take us beyond our intended scope, as explained at the start of the
chapter. Suffice it to say that by the introduction of a ‘regulating’ photon
mass $\mu_\gamma$, and consideration of relevant real photon processes along with virtual
ones, these infrared problems can be controlled (Weinberg 1995, Peskin and
Schroeder 1995).


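11.3 The $O(e^2)$ photon self-energy


The amplitude corresponding to figure 11.2(b) is $i\Pi^{[2]}_{\mu\nu}(q)$ where

$$
\begin{aligned}
i\Pi^{[2]}_{\mu\nu}(q) &= (-1)(-ie)^2\,{\rm Tr}\int \frac{d^4k}{(2\pi)^4}\,
\frac{i}{\slashed{q} + \slashed{k} - m}\,\gamma_\mu\,\frac{i}{\slashed{k} - m}\,\gamma_\nu
&& (11.13) \\
&= -e^2 \int \frac{d^4k}{(2\pi)^4}\,
\frac{{\rm Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu]}
{[(q+k)^2 - m^2][k^2 - m^2]}.
&& (11.14)
\end{aligned}
$$

Once again, this photon self-energy is analogous to the scalar particle
self-energy of chapter 10. There are two new features to be commented on in
(11.14). The first is the overall ‘$-1$’ factor, which occurs whenever there is a
closed fermion loop. The keen reader may like to pursue this via problem 11.2.
The second feature is the appearance of the trace symbol ‘Tr’: this is plausible
as the amplitude is basically a $1\gamma \to 1\gamma$ one with no spinor indices, but again
the reader can follow that through in problem 11.3.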

We now want to go some way into the calculation of $\Pi^{[2]}_{\mu\nu}$, because it will,
in the end, contain important physics: for example, corrections to Coulomb’s
law. The first step is to evaluate the numerator trace factor using the theorems
of section 8.2.3. We find (problem 11.4)

$$
{\rm Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu]
= 4\{(q_\mu + k_\mu)k_\nu + (q_\nu + k_\nu)k_\mu
- g_{\mu\nu}((q\cdot k) + k^2 - m^2)\}.
\tag{11.15}
$$

We then use the Feynman identity (10.40) to combine the denominators, yielding

$$
\frac{1}{[(q+k)^2 - m^2][k^2 - m^2]}
= \int_0^1 dx\,\frac{1}{[k'^2 - \Delta_\gamma + i\epsilon]^2}
\tag{11.16}
$$

where $k' = k + xq$, $\Delta_\gamma = -x(1-x)q^2 + m^2$ (note that $\Delta_\gamma$ is precisely the same
as $\Delta$ of (10.43) with $m_A = m_B = m$) and we have reinstated the implied ‘$i\epsilon$’.
Making the shift to the variable $k'$ in the numerator factor (11.15) produces
a revised numerator which is

$$
4\{2k'_\mu k'_\nu - g_{\mu\nu}(k'^2 - \Delta_\gamma)
- 2x(1-x)(q_\mu q_\nu - g_{\mu\nu}q^2) + \text{terms linear in } k'\}
\tag{11.17}
$$

where the terms linear in $k'$ will vanish by symmetry when integrated over $k'$
in (11.14). Our result so far is therefore

$$
\begin{aligned}
i\Pi^{[2]}_{\mu\nu}(q^2) = {}& -4e^2 \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}
\left[\frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2}
- \frac{g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)}\right] \\
& + 8e^2 (q_\mu q_\nu - g_{\mu\nu}q^2) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\,
\frac{x(1-x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2}.
\end{aligned}
\tag{11.18}
$$


Consider now the ultraviolet divergences of (11.18), adopting a simple
cut-off as a regularization. The terms in the first line are both apparently
quadratically divergent, while the integral in the second line is logarithmically
divergent. What counter terms do we have to cancel these divergences? The
answer is that the ‘$(Z_3 - 1)$’ counter term of figure 11.1(b) is of exactly the right
form to cancel the logarithmic divergence in the second line of (11.18), but
we have no counter term proportional to the $g_{\mu\nu}$ term in the first line. Note,
incidentally, that we can argue from Lorentz covariance (see appendix D) that

$$
\int \frac{d^4k'}{(2\pi)^4}\,\frac{k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2}
= A(\Delta_\gamma)\, g_{\mu\nu}
\tag{11.19}
$$

so that taking the dot product of both sides with $g^{\mu\nu}$ we deduce that

$$
\int \frac{d^4k'}{(2\pi)^4}\,\frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2}
= \frac{1}{2}\int \frac{d^4k'}{(2\pi)^4}\,\frac{k'^2\, g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)^2}.
\tag{11.20}
$$

It follows that both the terms in the first line of (11.18) produce a divergence
of the form $\sim\Lambda^2 g_{\mu\nu}$, and they do not cancel, at least in our simple cut-off
regularization.

A term proportional to $g_{\mu\nu}$ is, in fact, a photon mass term. A Lagrangian
mass term for the photon would have the form $\tfrac{1}{2}m_\gamma^2\, g_{\mu\nu}\hat A_0^\mu \hat A_0^\nu$, which after
introducing the rescaled $\hat A^\mu$ will generate a counter term proportional to
$g_{\mu\nu}\hat A^\mu \hat A^\nu$, and an associated Feynman amplitude proportional to $g_{\mu\nu}$. But
such a mass term violates gauge invariance! (It is plainly not invariant under
(7.69).) Evidently the simple momentum cut-off that we have adopted
as a regularization procedure does not respect gauge invariance. We saw in
section 8.6.2 that gauge invariance implied the condition

$$
q^\mu T_\mu = 0
\tag{11.21}
$$

where $q$ is the 4-momentum of a photon entering a one-photon amplitude $T_\mu$.
Our discussion of (11.21) was limited in section 8.6.2 to the case of a real
external photon, whereas the photon lines in $i\Pi^{[2]}_{\mu\nu}$ are internal and virtual;
nevertheless it is still true that gauge invariance implies (Peskin and Schroeder
1995, section 7.4)

$$
q^\mu \Pi^{[2]}_{\mu\nu} = q^\nu \Pi^{[2]}_{\mu\nu} = 0.
\tag{11.22}
$$

Condition (11.22) is guaranteed by the tensor structure $(q_\mu q_\nu - g_{\mu\nu}q^2)$ of the
second line in (11.18), provided the divergence is regularized. As previously
implied, a simple cut-off $\Lambda$ suffices for this term, since it does not alter the
tensor structure, and the $\Lambda$-dependence can be compensated by the ‘$Z_3 - 1$’
counter term which has the same tensor structure (cf figure 11.2(b)). But
what about the first line of (11.18)? Various gauge-invariant regularizations
have been used, the effect of all of which is to cause the first line of (11.18) to
vanish. The most widely used, since the 1970s, is the dimensional regularization
technique introduced by ’t Hooft and Veltman (1972), which involves the
‘continuation’ of the number of space-time dimensions from four to $d$ ($< 4$).
As $d$ is reduced, the integrals tend to diverge less, and the divergences can be
isolated via the terms which diverge as $d \to 4$. Using gauge-invariant dimensional
regularization, the two terms in the first line of (11.18) are found to
cancel each other exactly, leaving just the manifestly gauge invariant second
line (see appendix O of volume 2).

We proceed to the next step, renormalizing the gauge-invariant part of
$i\Pi^{[2]}_{\mu\nu}(q)$.


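11.4 The $O(e^2)$ renormalized photon self-energy


The surviving (gauge-invariant) term of $i\Pi^{[2]}_{\mu\nu}$ is

$$
\begin{aligned}
i\Pi^{[2]}_{\mu\nu}(q^2) &= 8e^2 (q_\mu q_\nu - q^2 g_{\mu\nu}) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\,
\frac{x(1-x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2}
&& (11.23) \\
&\equiv i(q^2 g_{\mu\nu} - q_\mu q_\nu)\,\Pi^{[2]}_\gamma(q^2).
&& (11.24)
\end{aligned}
$$

The $d^4k'$ integral in (11.23) is exactly the same as the one in (10.42), with $\Delta$
replaced by $\Delta_\gamma$. It contains a logarithmic divergence, which we regulate as
before by a simple cut-off $\Lambda$, so that we are dealing with the gauge-invariant
quantity $\Pi^{[2]}_\gamma(q^2, \Lambda^2)$. The calculation leading to (10.55) then tells us that, as
$\Lambda \to \infty$,

$$
\Pi^{[2]}_\gamma(q^2, \Lambda^2) = -\frac{e^2}{\pi^2}\int_0^1 dx\, x(1-x)
\left\{\ln\Lambda + (\ln 2 - 1) - \tfrac{1}{2}\ln\Delta_\gamma\right\}.
\tag{11.25}
$$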
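The analogue of (10.11) is then (in the gauge $\xi = 1$)

$$
\begin{aligned}
&\frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\cdot i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\,
\Pi^{[2]}_\gamma(q^2, \Lambda^2)\cdot\frac{-ig_{\sigma\nu}}{q^2} \\
&\quad + \frac{-ig_{\mu\rho}}{q^2}\cdot i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\,
\Pi^{[2]}_\gamma(q^2, \Lambda^2)\cdot\frac{-ig_{\sigma\tau}}{q^2}\cdot
i(q^2 g^{\tau\eta} - q^\tau q^\eta)\,\Pi^{[2]}_\gamma(q^2, \Lambda^2)\cdot\frac{-ig_{\eta\nu}}{q^2} + \cdots \\
&= \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\,P^\rho_\nu\,\Pi^{[2]}_\gamma(q^2, \Lambda^2)
+ \frac{-ig_{\mu\rho}}{q^2}\,P^\rho_\tau P^\tau_\nu\,(\Pi^{[2]}_\gamma(q^2, \Lambda^2))^2 + \cdots
\end{aligned}
\tag{11.26}
$$

where

$$
P^\rho_\nu = g^\rho_\nu - \frac{q^\rho q_\nu}{q^2}
$$

and $g^\rho_\nu = \delta^\rho_\nu$ (i.e. the $4\times 4$ unit matrix). It is easy to check (problem 11.5)
that $P^\rho_\tau P^\tau_\nu = P^\rho_\nu$. Hence the series (11.26) becomes

$$
\begin{aligned}
&\frac{-ig_{\mu\rho}}{q^2}\,P^\rho_\nu\left[1 + \Pi^{[2]}_\gamma(q^2, \Lambda^2)
+ (\Pi^{[2]}_\gamma(q^2, \Lambda^2))^2 + \cdots\right]
+ \frac{-ig_{\mu\rho}}{q^2}\,\frac{q^\rho q_\nu}{q^2} \\
&= \frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2(1 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))}
+ \frac{-i}{q^2}\left(\frac{q_\mu q_\nu}{q^2}\right)
\end{aligned}
\tag{11.27}
$$

after summing the geometric series, exactly as in (10.11)–(10.14).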
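
The projector property used to sum the series can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the metric signature $(+,-,-,-)$ and an arbitrarily chosen off-shell momentum, and the helper names `projector` and `matmul` are introduced here purely for illustration.

```python
# Metric signature (+,-,-,-), assumed here; indices run over 0..3.
METRIC = [1.0, -1.0, -1.0, -1.0]

def projector(q):
    """Return P[rho][nu] = delta^rho_nu - q^rho q_nu / q^2 for contravariant q."""
    q_lower = [METRIC[i] * q[i] for i in range(4)]
    q2 = sum(q[i] * q_lower[i] for i in range(4))
    return [[(1.0 if rho == nu else 0.0) - q[rho] * q_lower[nu] / q2
             for nu in range(4)] for rho in range(4)]

def matmul(a, b):
    """Contract the lower index of a with the upper index of b."""
    return [[sum(a[r][t] * b[t][n] for t in range(4)) for n in range(4)]
            for r in range(4)]

q = [0.3, 1.2, -0.7, 2.1]  # arbitrary illustrative momentum with q^2 != 0
P = projector(q)
PP = matmul(P, P)

# Idempotency: P^rho_tau P^tau_nu - P^rho_nu should vanish.
idempotency_error = max(abs(PP[r][n] - P[r][n])
                        for r in range(4) for n in range(4))

# Transversality: q_rho P^rho_nu should vanish.
q_lower = [METRIC[i] * q[i] for i in range(4)]
transversality_error = max(abs(sum(q_lower[r] * P[r][n] for r in range(4)))
                           for n in range(4))

print(idempotency_error, transversality_error)
```

Both printed numbers should be at the level of floating-point rounding. Idempotency is what collapses each term of (11.26) to a single factor of $P^\rho_\nu$, and transversality is the statement (11.22).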
But we have forgotten the counter term of figure 11.1(b), which contributes
an amplitude $-i(g^{\mu\nu}q^2 - q^\mu q^\nu)(Z_3 - 1)$. This has the effect of replacing $\Pi^{[2]}_\gamma$
in (11.27) by $\Pi^{[2]}_\gamma - (Z_3 - 1)$ and we arrive at the form

$$
\frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2(Z_3 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))}
+ \frac{-i}{q^2}\left(\frac{q_\mu q_\nu}{q^2}\right).
\tag{11.28}
$$

Now in any S-matrix element, at least one end of this corrected propagator
will connect to an external charged particle line via a vertex of the form
$j^\mu_a(p, p')$ (cf (8.98) and (8.99) for example), as in figure 11.3. But, as we have
seen in (8.100), current conservation implies

$$
q_\mu j^\mu_a(p, p') = 0.
\tag{11.29}
$$

Hence the parts of (11.28) with $q_\mu q_\nu$ factors will not contribute to physical
scattering amplitudes, and our $O(e^2)$ corrected photon propagator effectively
takes the simple form

$$
\frac{-ig_{\mu\nu}}{q^2(Z_3 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))}.
\tag{11.30}
$$

We must now determine $Z_3$ from the condition (just as for the C propagator)
that (11.30) has the form $-ig_{\mu\nu}/q^2$ as $q^2 \to 0$ (the mass-shell condition). This
gives

$$
Z_3^{[2]} = 1 + \Pi^{[2]}_\gamma(0, \Lambda^2)
\tag{11.31}
$$

the superscript on $Z_3$ indicating as usual that it is an $O(e^2)$ calculation as
evidenced by the $e^2$ factor in (11.18). We note from equation (11.25) that
$\Pi^{[2]}_\gamma(0, \Lambda^2)$ contains a $\ln\Lambda$ part, so that this time the field renormalization
constant $Z_3$ diverges when the cut-off is removed.
FIGURE 11.3
One-loop corrected photon propagator connected to a charged particle vertex.


Inserting (11.31) into (11.30) we obtain the final important expression for
the $\gamma$-propagator including the one-loop renormalized self-energy (cf (10.71)):

$$
\frac{-ig_{\mu\nu}}{q^2(1 - \bar\Pi^{[2]}_\gamma(q^2))}
\tag{11.32}
$$

where

$$
\bar\Pi^{[2]}_\gamma(q^2) = \Pi^{[2]}_\gamma(q^2, \Lambda^2) - \Pi^{[2]}_\gamma(0, \Lambda^2).
\tag{11.33}
$$

Equation (11.25) then leads to the result

$$
\bar\Pi^{[2]}_\gamma(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)
\ln\left[\frac{m^2}{m^2 - q^2 x(1-x)}\right],
\tag{11.34}
$$

which was first given by Schwinger (1949a). This ‘once-subtracted’ $\bar\Pi^{[2]}_\gamma$ is
finite as $\Lambda \to \infty$, and tends to zero as $q^2 \to 0$.
The generalization of (11.32) to all orders will be given by

$$
\frac{-ig_{\mu\nu}}{q^2(1 - \bar\Pi_\gamma(q^2))}
\tag{11.35}
$$

where $\bar\Pi_\gamma(q^2)$ is the all-orders analogue of $\bar\Pi^{[2]}_\gamma$ in (11.32), and is similarly
related to the $1\gamma$ irreducible photon self-energy $\Pi_{\mu\nu}$ via the analogue of (11.24):

$$
i\Pi_{\mu\nu}(q^2) = i(q^2 g_{\mu\nu} - q_\mu q_\nu)\,\Pi_\gamma(q^2).
\tag{11.36}
$$

Because $\Pi_\gamma$, and hence $\bar\Pi_\gamma$, has no $1\gamma$ intermediate states, it is expected to
have no contribution of the form $A^2/q^2$. If such a contribution were present,
(11.35) shows that it would result in a photon propagator having the form

$$
\frac{-ig_{\mu\nu}}{q^2 - A^2}
\tag{11.37}
$$

which is, of course, that of a massive particle. Thus, provided no such
contribution is present, the photon mass will remain zero through all radiative
corrections. It is important to note, though, that gauge invariance is fully
satisfied by the general form (11.36) relating $\Pi_{\mu\nu}$ to $\Pi_\gamma$; it does not prevent the
occurrence of such an ‘$A^2/q^2$’ piece in $\Pi_\gamma$. Remarkably, therefore, it seems
possible, after all, to have a massive photon while respecting gauge invariance!
This loophole in the argument ‘gauge invariance implies $m_\gamma = 0$’ was
first pointed out by Schwinger (1962).

FIGURE 11.4
The contribution of a massless particle to the photon self-energy.

Such a $1/q^2$ contribution in $\Pi_\gamma$ must, of course, correspond to a massless
single particle intermediate state, via a diagram of the form shown in
figure 11.4. Thus if the theory contains a massless particle, not the photon
(since $1\gamma$ states are omitted from $\Pi_\gamma$) but coupling to it, the photon can
acquire mass. This is one way of understanding the ‘Higgs mechanism’ for
generating a mass for a gauge-field quantum while still respecting the gauge
symmetry (Englert and Brout 1964, Higgs 1964, Guralnik et al. 1964). The
massless particle involved is called a ‘Goldstone boson’. As we shall see in
volume 2, just such a photon mass is generated in a superconductor, and a
similar mechanism is invoked in the Standard Model to give masses to the
$W^\pm$ and $Z^0$ gauge bosons, which mediate the weak interactions.


11.5 The physics of $\bar\Pi^{[2]}_\gamma(q^2)$


We now consider some immediate physical consequences of the formulae (11.32) 
and (11.34). 


11.5.1 Modified Coulomb’s law 


In section 1.3.3 we saw how, in the static limit, a propagator of the form
$-g_N^2(\mathbf{q}^2 + m_U^2)^{-1}$ could be interpreted (via a Fourier transform) in terms of a
Yukawa potential

$$
-\frac{g_N^2}{4\pi}\,\frac{e^{-r/a}}{r}
$$

where $a = m_U^{-1}$ (in units $\hbar = c = 1$). As $m_U \to 0$ we arrive at the Coulomb
potential, associated with the propagator $\sim 1/\mathbf{q}^2$ in the static ($q_0 = 0$) limit.
It follows that the corrected propagator (11.32) must represent a correction
to the $1/r$ Coulomb potential.

To see what it is, we expand the denominator of (11.32) so as to write
(11.32) as

$$
\frac{-ig_{\mu\nu}}{q^2}\left(1 + \bar\Pi^{[2]}_\gamma(q^2)\right)
\tag{11.38}
$$

which is in fact the perturbative $O(\alpha)$ correction to the propagator (we shall
return to (11.32) in a moment). At low energies, and in the static limit,
$q^2 = -\mathbf{q}^2$ will be small compared to the fermion (mass)$^2$ in (11.34), and we
may expand the logarithm in powers of $\mathbf{q}^2/m^2$, with the result that the static
propagator becomes (problem 11.6)

$$
\begin{aligned}
&\frac{ig_{\mu\nu}}{\mathbf{q}^2}\left(1 + \frac{\alpha}{15\pi}\frac{\mathbf{q}^2}{m^2}\right)
&& (11.39) \\
&= \frac{ig_{\mu\nu}}{\mathbf{q}^2} + ig_{\mu\nu}\,\frac{\alpha}{15\pi}\frac{1}{m^2}.
&& (11.40)
\end{aligned}
$$

The Fourier transform of the first term in (11.40) is proportional to the familiar
coulombic $1/r$ potential (see appendix G, for example), while the Fourier
transform of the constant ($\mathbf{q}^2$-independent) second term is a $\delta$-function:

$$
\int e^{i\mathbf{q}\cdot\mathbf{r}}\,\frac{d^3\mathbf{q}}{(2\pi)^3} = \delta^3(\mathbf{r}).
\tag{11.41}
$$

When (11.40) is used in any scattering process between two charged particles,
each charged particle vertex will carry a charge $e$ (or $-e$) and so the total
effective potential will be (in the attractive case)

$$
-\left\{\frac{\alpha}{r} + \frac{4\alpha^2}{15m^2}\,\delta^3(\mathbf{r})\right\}.
\tag{11.42}
$$


The second term in (11.42) may be treated as a perturbation in hydrogenic
atoms, taking $m$ to be the electron mass. Application of first-order perturbation
theory yields an energy shift

$$
\Delta E^{(1)}_n = -\frac{4\alpha^2}{15m^2}\int \psi^*_n(\mathbf{r})\,\delta^3(\mathbf{r})\,\psi_n(\mathbf{r})\,d^3\mathbf{r}
= -\frac{4\alpha^2}{15m^2}\,|\psi_n(0)|^2.
\tag{11.43}
$$

Only s-state wavefunctions are non-vanishing at the origin, where they take
the value (in hydrogen)

$$
\psi_n(0) = \frac{1}{\sqrt{\pi}}\left(\frac{\alpha m}{n}\right)^{3/2}
\tag{11.44}
$$

where $n$ is the principal quantum number. Hence for this case

$$
\Delta E^{(1)}_n = -\frac{4\alpha^5 m}{15\pi n^3}.
\tag{11.45}
$$

For example, in the 2s state the energy shift is $-1.122 \times 10^{-7}$ eV. Although
we did not discuss the Coulomb spectrum predicted by the Dirac equation
in chapter 3, it turns out that the $2^2S_{1/2}$ and $2^2P_{1/2}$ levels are degenerate if
no radiative corrections (such as the previous one) are applied. In fact, the
levels are found experimentally to be split apart by the famous ‘Lamb shift’,
which amounts to $\Delta E/2\pi\hbar = 1058$ MHz in frequency units. The shift we have
calculated, for the 2s level, is $-27.13$ MHz in these units, so it is a small, but
still perfectly measurable, contribution to the entire shift. This particular
contribution was first calculated by Uehling (1935).
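
The two numbers just quoted follow directly from (11.45). As a quick check, the sketch below evaluates the formula for $n = 2$; the numerical values of $\alpha$, the electron mass and Planck's constant are standard reference values supplied here (they are not given in the text), and the function name is purely illustrative.

```python
import math

# Reference constants (assumed inputs, not taken from the text).
ALPHA = 1.0 / 137.035999        # fine structure constant
M_E_EV = 0.51099895e6           # electron mass in eV (units c = 1)
H_EV_S = 4.135667696e-15        # Planck constant h in eV s

def uehling_shift_ev(n):
    """First-order s-state energy shift of (11.45), in eV."""
    return -4.0 * ALPHA**5 * M_E_EV / (15.0 * math.pi * n**3)

shift_ev = uehling_shift_ev(2)
# Frequency units: Delta E / (2 pi hbar) = Delta E / h, converted to MHz.
shift_mhz = shift_ev / H_EV_S / 1.0e6

print(f"2s shift: {shift_ev:.4e} eV = {shift_mhz:.2f} MHz")
```

With these inputs the result reproduces the quoted $-1.122\times10^{-7}$ eV and $-27.13$ MHz to the precision shown.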

While small in hydrogen and ordinary atoms, the ‘Uehling effect’ dominates
the radiative corrections in muonic atoms, where the ‘$m$’ in (11.44)
becomes the muon mass $m_\mu$. This means that the result (11.45) becomes

$$
-\frac{4\alpha^5}{15\pi n^3}\left(\frac{m_\mu}{m}\right)^2 m_\mu.
$$

Since the unperturbed energy levels are (in this case) proportional to $m_\mu$,
this represents a relative enhancement of $\sim(m_\mu/m)^2 \sim (210)^2$. This calculation
cannot be trusted in detail, however, as the muonic atom radius is
itself only $\sim 1/210$ of the electron radius in hydrogen, so that the
approximation $|\mathbf{q}| \sim 1/r \ll m$, which led to (11.42), is no longer accurate
enough. Nevertheless the order of magnitude is correct.


11.5.2 Radiatively induced charge form factor


This leads us to consider (11.38) more generally, without making the low $q^2$
expansion. In chapter 8 we learned how the static Coulomb potential became
modified by a form factor $F(\mathbf{q}^2)$ if the scattering centre was not point-like,
and we also saw how the idea could be extended to covariant form factors
for spin-0 and spin-$\tfrac{1}{2}$ particles. Referring to the case of $e^-\mu^-$ scattering for
definiteness (section 8.7), we may consider the effect of inserting (11.38) into
(8.182). The result is

$$
e^2\,\bar u_{k'}\gamma_\mu u_k \left\{\frac{g^{\mu\nu}}{q^2}\left(1 + \bar\Pi^{[2]}_\gamma(q^2)\right)\right\} \bar u_{p'}\gamma_\nu u_p.
\tag{11.46}
$$

Referring now to the discussion of form factors for charged spin-$\tfrac{1}{2}$ particles in
section 8.8, we can share the correction (11.46) equally between the $e^-$ and
the $\mu^-$ vertices and write

$$
e\,\bar u_{k'}\gamma_\mu u_k \to e\,\bar u_{k'}\gamma_\mu u_k\,(1 + \bar\Pi^{[2]}_\gamma(q^2))^{1/2}
\approx e\,\bar u_{k'}\gamma_\mu u_k\,(1 + \tfrac{1}{2}\bar\Pi^{[2]}_\gamma(q^2))
\tag{11.47}
$$

for the electron, and similarly for the muon. From (8.208) this means that our
‘radiative correction’ has generated some effective extension of the charge, as
given by a charge form factor $F_1(q^2) = 1 + \tfrac{1}{2}\bar\Pi^{[2]}_\gamma(q^2)$. Note that the condition
$F_1(0) = 1$ is satisfied since $\bar\Pi^{[2]}_\gamma(0) = 0$.

In the static case, or for scattering of equal mass particles in the CM
system, we have $q^2 = -\mathbf{q}^2$ and we may consider the Fourier transform of
the function $F_1(-\mathbf{q}^2)$, to obtain the charge distribution. The integral is
discussed in Weinberg (1995, section 10.2) and in Peskin and Schroeder (1995,
section 7.5). The latter authors show that the approximate radial distribution
of charge is $\sim e^{-2mr}/(mr)^{3/2}$, indicating that it has a range $\sim 1/(2m)$. Here $2m$ is
precisely the mass of the fermion-antifermion intermediate state in the loop
which yields $\bar\Pi^{[2]}_\gamma$, so this result represents a plausible qualitative extension of
Yukawa’s relationship (1.20) to the case of two-particle exchange. In any case,
the range represented by $\bar\Pi^{[2]}_\gamma$ is of order of the fermion Compton wavelength
$1/m$, which is an important insight; this is why we need to do better than the
point-like approximation (11.42) in the case of muonic atoms.


11.5.3 The running coupling constant 


There is yet another way of interpreting (11.38). Referring to (11.46), we may
regard

$$
e^2(q^2) = e^2[1 + \bar\Pi^{[2]}_\gamma(q^2)]
\tag{11.48}
$$

as a ‘$q^2$-dependent effective charge’. In fact, it is usually written as a
‘$q^2$-dependent fine structure constant’

$$
\alpha(q^2) = \alpha[1 + \bar\Pi^{[2]}_\gamma(q^2)].
\tag{11.49}
$$


The concept of a q?-dependent charge may be startling but the related one of 
a spatially dependent charge is, in fact, familiar from the theory of dielectrics. 
Consider a test charge q in a polarizable dielectric medium, such as water. 
If we introduce another test charge —q into the medium, the electric field 
between the two test charges will line up the water molecules (which have a 
permanent electric dipole moment) as shown in figure 11.5. There will be an 
induced dipole moment P per unit volume, and the effect of P on the resultant 
field is (from elementary electrostatics) the same as that produced by a volume 
charge equal to $-\,{\rm div}\,\mathbf{P}$. If, as is usual, $\mathbf{P}$ is taken to be proportional to $\mathbf{E}$,
so that $\mathbf{P} = \chi\epsilon_0\mathbf{E}$, Gauss’ law will be modified from

$$
{\rm div}\,\mathbf{E} = \rho_{\rm free}/\epsilon_0
\tag{11.50}
$$

to

$$
{\rm div}\,\mathbf{E} = (\rho_{\rm free} - {\rm div}\,\mathbf{P})/\epsilon_0 = \rho_{\rm free}/\epsilon_0 - {\rm div}(\chi\mathbf{E})
\tag{11.51}
$$

where $\rho_{\rm free}$ refers to the test charges introduced into the dielectric. If $\chi$ is
slowly varying as compared to $\mathbf{E}$, it may be taken as approximately constant
in (11.51), which may then be written as

$$
{\rm div}\,\mathbf{E} = \rho_{\rm free}/\epsilon
\tag{11.52}
$$

where $\epsilon = (1 + \chi)\epsilon_0$ is the dielectric constant of the medium, $\epsilon_0$ being that of
the vacuum. Thus the field is effectively reduced by the factor $(1 + \chi)^{-1} = \epsilon_0/\epsilon$.

FIGURE 11.5
Screening of charge in a dipolar medium (from Aitchison 1985).

This is all familiar ground. Note, however, that this treatment is essentially
macroscopic, the molecules being replaced by a continuous distribution of
charge density $-\,{\rm div}\,\mathbf{P}$. When the distance between the two test charges
is as small as, roughly, the molecular diameter, this reduction (or screening
effect) must cease and the field between them has the full unscreened value.
In general, the electrostatic potential between two test charges $q_1$ and $q_2$ in a
dielectric can be represented phenomenologically by

$$
V(r) = q_1 q_2/4\pi\epsilon(r)r
\tag{11.53}
$$

where $\epsilon(r)$ is assumed to vary slowly from the value $\epsilon$ for $r \gg d$ to the value $\epsilon_0$
for $r \ll d$, where $d$ is the diameter of the polarized molecules. The situation
may be described in terms of an effective charge

$$
q' = q/[\epsilon(r)]^{1/2}
\tag{11.54}
$$

for each of the test charges. Thus we have an effective charge which depends
on the interparticle separation, as shown in figure 11.6.

FIGURE 11.6
Effective (screened) charge versus separation between charges (from Aitchison
1985).

Now consider the application of this idea to QED, replacing the polarizable
medium by the vacuum. The important idea is that, in the vicinity of a test
charge in vacuo, charged pairs can be created. Pairs of particles of mass $m$
can exist for a time of the order of $\Delta t \sim \hbar/mc^2$. They can spread apart
a distance of order $c\Delta t$ in this time, i.e. a distance of approximately $\hbar/mc$,
which is the Compton wavelength $ƛ_c$. This distance gives a measure of the
‘molecular diameter’ we are talking about, since it is the polarized virtual
pairs which now provide a vacuum screening effect around the original charged
particle. The largest ‘diameter’ will be associated with the smallest mass $m$,
in this case the electron mass. Not coincidentally, this estimate of the range
of the ‘spreading’ of the charge ‘cloud’ is just what we found in section 11.5.2:
namely, the fermion Compton wavelength. The longest-range part of the cloud
will be that associated with the lightest charged fermion, the electron.

In this analogy the bare vacuum (no virtual pairs) corresponds to the 
‘vacuum’ used in the previous macroscopic analysis and the physical vacuum 
(virtual pairs) to the polarizable dielectric. We cannot, of course, get outside 
the physical vacuum, so that we are really always dealing with effective charges 
that depend on r. What, then, do we mean by the familiar symbol e? This 
is simply the effective charge as $r \to \infty$ or $q^2 \to 0$; or, in practice, the charge
relevant for distances much larger than the particles’ Compton wavelength.
This is how our $q^2 \to 0$ definition is to be understood.

Let us consider, then, how $\alpha(q^2)$ varies when $q^2$ moves to large space-like
values, such that $-q^2$ is much greater than $m^2$ (i.e. to distances well within
the ‘cloud’). For $|q^2| \gg m^2$ we find (problem 11.7) from (11.34) that

$$
\bar\Pi^{[2]}_\gamma(q^2) = \frac{\alpha}{3\pi}\left[\ln\left(\frac{|q^2|}{m^2}\right) - \frac{5}{3} + O(m^2/|q^2|)\right]
\tag{11.55}
$$

so that our $q^2$-dependent fine structure constant, to leading order in $\alpha$, is

$$
\alpha(q^2) \approx \alpha\left[1 + \frac{\alpha}{3\pi}\ln\left(\frac{|q^2|}{Am^2}\right)\right]
\tag{11.56}
$$

for large values of $|q^2|/m^2$, where $A = \exp(5/3)$.
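
Both limits of (11.34) used so far, the small-$|q^2|$ expansion behind (11.39) and the large-$|q^2|$ form (11.55), can be checked by evaluating the Feynman-parameter integral numerically. The following is only an illustrative sketch: it uses a simple midpoint rule in $x$ (an arbitrary choice of quadrature), works with the dimensionless variable $u = q^2/m^2$, and is valid below the pair threshold $u < 4$. The function name `pi_bar` and the sample values of $u$ are assumptions made here.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (reference value, assumed)

def pi_bar(u, n=20000):
    """Pi-bar of (11.34) for u = q^2/m^2 < 4, by the midpoint rule in x."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log(1.0 / (1.0 - u * w))
    return -(2.0 * ALPHA / math.pi) * total * h

# Small space-like q^2: leading term of the expansion used for (11.39).
u_small = -0.01
low = pi_bar(u_small)
low_approx = -ALPHA * u_small / (15.0 * math.pi)

# Large space-like q^2: asymptotic form (11.55), without the O(m^2/|q^2|) terms.
u_large = -1.0e6
high = pi_bar(u_large)
high_approx = (ALPHA / (3.0 * math.pi)) * (math.log(abs(u_large)) - 5.0 / 3.0)

print(low / low_approx, high / high_approx)
```

Both ratios should be close to 1, and `pi_bar(0.0)` vanishes, in line with the normalization $\bar\Pi^{[2]}_\gamma(0) = 0$.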
Equation (11.56) shows that the effective strength $\alpha(q^2)$ tends to increase
at large $|q^2|$ (short distances). This is, after all, physically reasonable: the
reduction in the effective charge caused by the dielectric constant associated
with the polarization of the vacuum disappears (the charge increases) as we
pass inside some typical dipole length. In the present case, that length is $m^{-1}$
(in our standard units $\hbar = c = 1$), the fermion Compton wavelength, a typical
distance over which the fluctuating pairs extend.

The foregoing is the reason why this whole phenomenon is called vacuum
polarization, and why the original diagram which gave $\bar\Pi^{[2]}_\gamma$ is called a vacuum
polarization diagram.

Equation (11.56) is the lowest-order correction to $\alpha$, in a form valid for
$|q^2| \gg m^2$. It turns out that, in this limit, the dominant vacuum polarization
contributions (for a theory with one charged fermion) can be isolated in each
order of perturbation theory and summed explicitly. The result of summing
these ‘leading logarithms’ is

$$
\alpha(Q^2) = \frac{\alpha}{[1 - (\alpha/3\pi)\ln(Q^2/Am^2)]} \quad \text{for } Q^2 \gg m^2
\tag{11.57}
$$

where we now introduce $Q^2 = -q^2$, a positive quantity when $q$ is a momentum
transfer. The justification for (11.57), which of course amounts to the
very plausible return to (11.32) instead of (11.38), is subtle, and depends
upon ideas grouped under the heading of the ‘renormalization group’. This
is beyond the scope of the present volume, but will be taken up again in
volume 2.

Equation (11.57) presents some interesting features. First, note that for
typical large $Q^2 \sim (50\ {\rm GeV})^2$, say, the change in the effective $\alpha$ predicted by
(11.57) is quite measurable. Let us write

$$
\alpha(Q^2) = \frac{\alpha}{1 - \Delta\alpha(Q^2)}
\tag{11.58}
$$

in general, where $\Delta\alpha(Q^2)$ includes the contributions from all charged fermions
with mass $m$ such that $m^2 < Q^2$. The contribution from the charged leptons
is then straightforward, being given by

$$
\Delta\alpha_{\rm leptons} = \frac{\alpha}{3\pi}\sum_l \ln(Q^2/Am_l^2)
\tag{11.59}
$$

where $m_l$ is the lepton mass. Including the e, $\mu$ and $\tau$ one finds (problem 11.8)

$$
\Delta\alpha_{\rm leptons}(Q^2 = (50\ {\rm GeV})^2) \approx 0.03.
\tag{11.60}
$$
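
The estimate (11.60) is easy to reproduce from (11.59). The sketch below does so; the lepton masses and the value of $\alpha$ are standard reference values supplied here as assumptions (the text does not list them), and the function name is illustrative only.

```python
import math

ALPHA = 1.0 / 137.036          # fine structure constant (assumed reference value)
A = math.exp(5.0 / 3.0)        # the constant A of (11.56)

# Charged lepton masses in GeV (assumed reference values).
LEPTON_MASSES_GEV = {"e": 0.000511, "mu": 0.10566, "tau": 1.777}

def delta_alpha_leptons(q_gev):
    """Leading-log lepton contribution (11.59) at momentum transfer Q (GeV)."""
    logs = [math.log(q_gev**2 / (A * m**2))
            for m in LEPTON_MASSES_GEV.values() if m < q_gev]
    return ALPHA / (3.0 * math.pi) * sum(logs)

d_lep = delta_alpha_leptons(50.0)
print(f"Delta alpha (leptons) at Q = 50 GeV: {d_lep:.4f}")
```

With these inputs the lepton sum comes out just below 0.03, consistent with (11.60); the electron dominates because of its small mass.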


However, the corresponding quark loop contributions are subject to strong
interaction corrections, and are not straightforward to calculate. We shall not
pursue this in detail here, noting just that the total contribution from the five
quarks u, d, s, c and b has a value very similar to (11.60) for the leptons (see,
for example, Altarelli et al. 1989). Including both the leptonic and hadronic
contributions then yields the estimate

$$
\alpha(Q^2 = (50\ {\rm GeV})^2) \approx \frac{1}{137}\times\frac{1}{0.94} \approx \frac{1}{129}.
\tag{11.61}
$$

The predicted increase of $\alpha(Q^2)$ at large $Q^2$ has been tested by measuring
the differential cross section for Bhabha scattering,

$$
e^+e^- \to e^+e^-.
\tag{11.62}
$$

We are interested in the contribution from one-photon exchange in the
t-channel, which will contain the factor $\alpha(Q^2)$. To favour this contribution,
the CM energy should be well beyond the $Z^0$ peak in the s-channel (cf figure
9.16). This was the case at the highest LEP energy, $\sqrt{s} = 198$ GeV, which also
allowed large $Q^2$ values to be probed. The L3 experiment covered the region
$1800\ {\rm GeV}^2 < Q^2 < 21600\ {\rm GeV}^2$ (Achard et al. 2005). These results, and
earlier data from L3 (Acciari et al. 2000) and OPAL (Abbiendi et al. 2000),
clearly show the expected rise in $\alpha(Q^2)$ as $Q^2$ increases, and are in good
quantitative agreement with the theoretical prediction of QED (Burkhardt
and Pietrzyk 2001).

The notion of a $q^2$-dependent coupling constant is, in fact, quite general:
for example, we could just as well interpret (10.71) in terms of a $q^2$-dependent
$g_{\rm ph}(q^2)$. Such ‘varying constants’ are called running coupling constants. Until
1973 it was generally believed that they would all behave in essentially the
same way as (11.57), namely with a logarithmic rise as $Q^2$ increases. Many people
(in particular Landau 1955) noted that if equation (11.57) is taken at face value
for arbitrarily large $Q^2$, then $\alpha(Q^2)$ itself will diverge at $Q^2 = Am^2\exp(3\pi/\alpha)$.
Taking $m$ to be the mass of an electron, this is of course an absurdly high
energy. Besides, as such energies are reached, approximations made in arriving
at (11.57) will break down; all we can really say is that perturbation theory
will fail as we approach such energies.
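
To see just how absurdly high this energy is, one can evaluate the position of the pole in (11.57). The sketch below works with logarithms to avoid overflow; the values of $\alpha$ and of the electron mass are reference values assumed here, and the variable names are illustrative.

```python
import math

ALPHA = 1.0 / 137.036     # fine structure constant (assumed reference value)
M_E_GEV = 0.000511        # electron mass in GeV (assumed reference value)

# Pole of (11.57): Q^2 = A m^2 exp(3 pi / alpha), with A = exp(5/3), so that
# ln Q = ln m + 5/6 + 3 pi / (2 alpha).
ln_q_pole = math.log(M_E_GEV) + 5.0 / 6.0 + 3.0 * math.pi / (2.0 * ALPHA)
log10_q_pole_gev = ln_q_pole / math.log(10.0)

print(f"Q_pole ~ 10^{log10_q_pole_gev:.0f} GeV")
```

The result is of order $10^{277}$ GeV, vastly beyond any physically meaningful scale, which is why the divergence is an academic point in QED.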

While this may be an academic point in QED, it turns out that there is one 
part of the Standard Model where it may be relevant. This is the ‘Higgs sector’ 
involving a complex scalar field, as will be discussed in volume 2. In this case, 
the ‘running’ of the Higgs coupling constant can be invoked to suggest a useful 
upper bound on the Higgs mass (Maiani 1991). 

The significance of the 1973 date is that it was in that year that one 
of the most important discoveries in ‘post-QED’ quantum field theory was 
made, by Politzer (1973) and by Gross and Wilczek (1973). They performed 
a similar one-loop calculation in the more complicated case of QCD, which is 
a ‘non-Abelian gauge theory’ (as is the theory of the weak interactions in the 
electroweak theory). They found that the QCD analogue of (11.57) was 


$$
\alpha_s(Q^2) = \frac{\alpha_s(\mu^2)}{\left[1 + \dfrac{\alpha_s(\mu^2)}{12\pi}(33 - 2f)\ln(Q^2/\mu^2)\right]}
\tag{11.63}
$$

where $f$ is the number of fermion-antifermion loops considered, and $\mu$ is a
reference mass scale. The crucial difference from (11.57) is the large positive
contribution ‘+33’, which is related to the contributions from the gluonic
self-interactions (non-existent among photons). The quantity $\alpha_s(Q^2)$ now tends
to decrease at large $Q^2$ (provided $f \le 16$), tending ultimately to zero. This
property is called ‘asymptotic freedom’ and is highly relevant to understanding
the success of the parton model of chapter 9, in which the quarks and
gluons are taken to be essentially free at large values of $Q^2$. This can be
qualitatively understood in terms of $\alpha_s(Q^2) \to 0$ for high momentum transfers
(‘deep scattering’). The non-Abelian parts of the Standard Model will be
considered in volume 2, where we shall return again to $\alpha_s(Q^2)$.

FIGURE 11.7
Vacuum polarization insertion in the virtual one-photon annihilation amplitude
in $e^+e^- \to \mu^+\mu^-$.
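
The contrast between (11.57) and (11.63) can be made concrete with a few numbers. The sketch below evaluates (11.63) for illustrative inputs that are assumptions made here, not values from the text: a reference coupling $\alpha_s(\mu^2) = 0.118$ at $\mu = 91.19$ GeV and $f = 5$. The function name and its parameters are likewise hypothetical.

```python
import math

def alpha_s(q_gev, alpha_s_ref, mu_gev, f):
    """One-loop running coupling of (11.63) at momentum transfer Q (GeV)."""
    log_term = math.log(q_gev**2 / mu_gev**2)
    return alpha_s_ref / (
        1.0 + alpha_s_ref / (12.0 * math.pi) * (33 - 2 * f) * log_term
    )

# Illustrative inputs (assumed, not taken from the text).
ALPHA_S_REF, MU_GEV, N_F = 0.118, 91.19, 5

low_q = alpha_s(10.0, ALPHA_S_REF, MU_GEV, N_F)      # below the reference scale
high_q = alpha_s(1000.0, ALPHA_S_REF, MU_GEV, N_F)   # above the reference scale

# With 33 - 2f < 0 (here f = 17) the coupling would grow with Q^2, as in QED.
qed_like = alpha_s(1000.0, ALPHA_S_REF, MU_GEV, 17)

print(low_q, high_q, qed_like)
```

For these inputs the coupling falls from about 0.17 at 10 GeV to about 0.09 at 1 TeV, whereas with $f = 17$ the sign of the logarithmic term flips and the coupling rises instead.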


11.5.4 $\bar\Pi^{[2]}_\gamma$ in the s-channel


We have still not exhausted the riches of $\bar\Pi^{[2]}_\gamma(q^2)$. Hitherto we have
concentrated on regarding our corrected propagator as appearing in a t-channel
exchange process, where $q^2 < 0$. But of course it could also perfectly well
enter an s-channel process such as $e^+e^- \to \mu^+\mu^-$ (see problem 8.18), as
in figure 11.7. In this case, the 4-momentum carried by the photon is
$q = p_{e^+} + p_{e^-} = p_{\mu^+} + p_{\mu^-}$, so that $q^2$ is precisely the usual invariant variable
‘$s$’ (cf section 6.3.3), which in turn is the square of the CM energy and is
therefore positive. In fact, the process of figure 11.7 occurs physically only for
$q^2 = s > 4m_\mu^2$, where $m_\mu$ is the muon mass.

Consider, therefore, our formula (11.34) for $q^2 > 0$, that is, in the time-like
rather than the space-like ($q^2 < 0$) region. The crucial new point is that the
argument $[m^2 - q^2 x(1-x)]$ of the logarithm can now become negative, so
that $\bar\Pi^{[2]}_\gamma$ must develop an imaginary part. The smallest $q^2$ for which this can
happen will correspond to the largest possible value of the product $x(1-x)$,
for $0 < x < 1$. This value is $\tfrac{1}{4}$, and so $\bar\Pi^{[2]}_\gamma$ acquires an imaginary part for $q^2 > 4m^2$,
which is the threshold for real creation of an $e^+e^-$ pair.
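
A short worked step, offered as a sketch rather than as part of the original derivation, makes the size of this imaginary part explicit. The symbols $\beta$ and $x_\pm$ are introduced here for convenience. For $q^2 > 4m^2$ the argument of the logarithm in (11.34) is negative only for $x_- < x < x_+$, and there the logarithm acquires an imaginary part $\pm i\pi$, the sign being fixed by the $i\epsilon$ prescription. Hence

$$
x_\pm = \tfrac{1}{2}(1 \pm \beta), \quad \beta = \sqrt{1 - 4m^2/q^2}, \qquad
\left|{\rm Im}\,\bar\Pi^{[2]}_\gamma(q^2)\right| = \frac{2\alpha}{\pi}\,\pi\int_{x_-}^{x_+} dx\, x(1-x)
= \frac{\alpha}{3}\,\beta\left(1 + \frac{2m^2}{q^2}\right).
$$

This vanishes at threshold ($\beta = 0$) and tends to $\alpha/3$ for $q^2 \gg m^2$.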

This is the first time that we have encountered an imaginary part in a
Feynman amplitude which, for figure 11.7 and omitting all the spinor factors,
is once again

$$
\frac{1}{q^2(1 - \bar\Pi^{[2]}_\gamma(q^2))}
\tag{11.64}
$$

but now $q^2 > 4m_\mu^2$, which is greater than $4m^2$, so that $\bar\Pi^{[2]}_\gamma(q^2)$ in (11.64) has
an imaginary part. There is a good physical reason for this, which has to do
with unitarity. This was introduced in section 6.2.2 in terms of the relation
$SS^\dagger = I$ for the S-matrix. The invariant amplitude $\mathcal{M}$ is related to $S$ by
$S_{fi} = 1 + i(2\pi)^4\delta^4(p_i - p_f)\mathcal{M}_{fi}$ (cf (6.102)). Inserting this into $SS^\dagger = I$ leads
to an equation of the form (for help see Peskin and Schroeder (1995, section
7.3))

$$
2\,{\rm Im}\,\mathcal{M}_{fi} = \sum_k \mathcal{M}^*_{kf}\mathcal{M}_{ki}\,(2\pi)^4\delta^4\Big(p_i - \sum_j q_j\Big)
\tag{11.65}
$$

where ‘$\sum_k$’ stands for the phase space integral involving momenta $q_1, q_2, \ldots$
over the states allowed by energy-momentum conservation. This implies that
as the energy crosses each threshold for production of a newly allowed state,
there will be a new contribution to the imaginary part of $\mathcal{M}$. This is exactly
what we are seeing here, at the $e^+e^-$ threshold.

It is interesting, incidentally, that (11.65) can be used to derive the
relativistic generalization of the optical theorem given in appendix H (note that
the right-hand side of (11.65) is clearly related to the total cross section for
$i \to k$, if $i = f$).

As regards the real part of $\bar\Pi^{[2]}_\gamma(q^2)$ in the time-like region, it will be given
by (11.57) with $Q^2$ replaced by $q^2$, or $s$, for large values of $q^2$. Again,
measurements have verified the predicted variation of $\alpha(q^2)$ in the time-like region
(Miyabayashi et al. 1995, Ackerstaff et al. 1998, Abbiendi et al. 1999, 2000).

There is one more ‘elementary’ loop that we must analyse: the vertex
correction shown in figure 11.8, which we now discuss. We will see how the
important relation $Z_1 = Z_2$ emerges, and introduce some of the physics
contained in the renormalized vertex.


11.6 The $O(e^2)$ vertex correction, and $Z_1 = Z_2$


The amplitude corresponding to figure 11.8 is 


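    $$-\mathrm{i}e\,\bar{u}(p')\,\Gamma_\mu^{[2]}(p,p')\,u(p) = \bar{u}(p') \int (-\mathrm{i}e\gamma^\nu)\,\frac{-\mathrm{i}g_{\nu\lambda}}{k^2}\,\frac{\mathrm{i}}{\not{p}' - \not{k} - m}\,(-\mathrm{i}e\gamma_\mu)\,\frac{\mathrm{i}}{\not{p} - \not{k} - m}\,(-\mathrm{i}e\gamma^\lambda)\,\frac{d^4k}{(2\pi)^4}\,u(p)$$    (11.66)

where $\gamma_\nu = g_{\nu\lambda}\gamma^\lambda$, and $\Gamma_\mu^{[2]}$ represents the correction to the standard vertex
and again $\xi = 1$. We find

    $$\Gamma_\mu^{[2]}(p,p') = -\mathrm{i}e^2 \int \frac{1}{k^2}\,\gamma^\nu\,\frac{1}{\not{p}' - \not{k} - m}\,\gamma_\mu\,\frac{1}{\not{p} - \not{k} - m}\,\gamma_\nu\,\frac{d^4k}{(2\pi)^4}.$$    (11.67)


FIGURE 11.8
One-loop vertex correction.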


The integral is logarithmically divergent at large $k$, by power counting, and
the divergence will be cancelled by the $Z_1$ counter term of figure 11.1(c). It
turns out to be infrared divergent also, as was $d\Sigma^{[2]}/d\not{p}$. As in the latter
case, we leave the infrared problem aside, concentrating on the removal of
ultraviolet divergences.

$Z_1$ is determined by the requirement that the total amplitude at
$q = p - p' = 0$, for on-shell fermions, is just $-\mathrm{i}e\,\bar{u}(p)\gamma_\mu u(p)$, this being our definition
of ‘e’. Hence we have (at $O(e^2)$)


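    $$-\mathrm{i}e\,\bar{u}(p)\,\Gamma_\mu^{[2]}(p,p)\,u(p) - \mathrm{i}e\,\bar{u}(p)\,\gamma_\mu\,(Z_1^{[2]} - 1)\,u(p) = 0$$    (11.68)

and so

    $$\Gamma_\mu^{[2]}(p,p) + \gamma_\mu\,(Z_1^{[2]} - 1) = 0.$$    (11.69)

The renormalized vertex correction $\bar{\Gamma}_\mu^{[2]}$ may then be defined as

    $$\bar{\Gamma}_\mu^{[2]}(p,p') = \Gamma_\mu^{[2]}(p,p') + (Z_1^{[2]} - 1)\gamma_\mu = \Gamma_\mu^{[2]}(p,p') - \Gamma_\mu^{[2]}(p,p)$$    (11.70)

and in this ‘once-subtracted’ form it is finite, and equal to zero at $q = 0$.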
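
How a single subtraction removes a logarithmic divergence can be seen in a toy model. The sketch below is an illustration only: the integral $\int_0^\Lambda k^3\,dk/(k^2+a)^2$ is an assumed stand-in for a logarithmically divergent loop integral, not the actual vertex integral (11.67), and the names are hypothetical. Its closed form grows like $\ln\Lambda$, while the difference of two such integrals, the analogue of (11.70), tends to the cut-off independent value $\tfrac{1}{2}\ln(a_0/a)$.

```python
import math

# Toy model of a 'once-subtracted' amplitude. The integrand is an assumed
# stand-in for a logarithmically divergent loop integral, not (11.67) itself.


def toy_integral(a, cutoff):
    """Closed form of  int_0^cutoff k^3 dk / (k^2 + a)^2  for a > 0."""
    u = cutoff * cutoff + a
    return 0.5 * (math.log(u / a) + a / u - 1.0)


def subtracted(a, a0, cutoff):
    """Difference of two toy integrals, the analogue of (11.70)."""
    return toy_integral(a, cutoff) - toy_integral(a0, cutoff)


if __name__ == "__main__":
    a, a0 = 3.0, 1.0
    for cutoff in (1e2, 1e4, 1e6):
        print(cutoff, toy_integral(a, cutoff), subtracted(a, a0, cutoff))
    print("cut-off independent limit:", 0.5 * math.log(a0 / a))
```

The unsubtracted column keeps growing with the cut-off, whereas the subtracted column settles down and vanishes identically at $a = a_0$, just as $\bar{\Gamma}_\mu^{[2]}$ vanishes at $q = 0$.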

We shall consider some physical consequences of $\bar{\Gamma}_\mu^{[2]}$ in a moment, but
first we show that (at $O(e^2)$) $Z_1^{[2]} = Z_2^{[2]}$, and explain the significance of this
important relation. It is, after all, at first sight a rather surprising equality
between two apparently unrelated quantities, one associated with the fermion
self-energy, the other with the vertex part. From (11.9) we have, for the
fermion self-energy,

    $$\Sigma^{[2]}(p) = -\mathrm{i}e^2 \int \frac{1}{k^2}\,\gamma^\nu\,\frac{1}{\not{p} - \not{k} - m}\,\gamma_\nu\,\frac{d^4k}{(2\pi)^4}.$$    (11.71)


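One can discern some kind of similarity between (11.71) and (11.67), which
can be elucidated with the help of a little algebra.
Consider differentiating the identity $(\not{p} - m)(\not{p} - m)^{-1} = 1$ with respect to
$p^\mu$:

    $$\begin{aligned}
    0 &= \frac{\partial}{\partial p^\mu}\left[(\not{p} - m)(\not{p} - m)^{-1}\right] \\
      &= \left[\frac{\partial}{\partial p^\mu}(\not{p} - m)\right](\not{p} - m)^{-1} + (\not{p} - m)\,\frac{\partial}{\partial p^\mu}(\not{p} - m)^{-1} \\
      &= \gamma_\mu(\not{p} - m)^{-1} + (\not{p} - m)\,\frac{\partial}{\partial p^\mu}(\not{p} - m)^{-1}.
    \end{aligned}$$    (11.72)

It follows that

    $$\frac{\partial}{\partial p^\mu}(\not{p} - m)^{-1} = -(\not{p} - m)^{-1}\,\gamma_\mu\,(\not{p} - m)^{-1}$$    (11.73)

from which the Ward identity (Ward 1950) follows immediately:

    $$-\frac{\partial \Sigma^{[2]}}{\partial p^\mu} = \Gamma_\mu^{[2]}(p, p' = p).$$    (11.74)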


Derived here to one-loop order, the identity is, in fact, true to all orders,
provided that a gauge-invariant regularization is adopted. Note that the identity
deals with $\Gamma_\mu^{[2]}$ at zero momentum transfer ($q = p - p' = 0$), which is the
value at which $e$ is defined. Note also that consistently with (11.74), each of
$\partial\Sigma^{[2]}/\partial p^\mu$ and $\Gamma_\mu^{[2]}$ is both infrared and ultraviolet divergent, though we shall
only be concerned with the latter.
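
The matrix identity (11.73), on which the Ward identity rests, can be verified numerically. The following pure-Python sketch is an illustration under stated assumptions: it uses the Dirac representation of the $\gamma$-matrices, the metric $(+,-,-,-)$, an arbitrary off-shell momentum, and the closed form $(\not{p} - m)^{-1} = (\not{p} + m)/(p^2 - m^2)$; all function names are hypothetical. It compares a central finite difference of the inverse with the right-hand side of (11.73).

```python
# Numerical check of (11.73) with explicit Dirac matrices (no external libraries).
# Representation, metric signature and all names are assumptions of this sketch.


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(coeffs, mats):
    """Linear combination sum_n coeffs[n] * mats[n] of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(4)]
            for i in range(4)]


def block(sigma, sign):
    """4x4 matrix [[0, sigma], [sign * sigma, 0]] from a 2x2 matrix."""
    out = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            out[i][j + 2] = sigma[i][j]
            out[i + 2][j] = sign * sigma[i][j]
    return out


SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
METRIC = [1, -1, -1, -1]
# gamma^0 = diag(1, 1, -1, -1); gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA_UP = [[[(1 if i < 2 else -1) if i == j else 0 for j in range(4)]
             for i in range(4)]]
GAMMA_UP += [block(s, -1) for s in SIGMA]
GAMMA_DOWN = [lincomb([METRIC[mu]], [GAMMA_UP[mu]]) for mu in range(4)]


def pslash(p):
    """p-slash = gamma^mu p_mu, with p given by its components p^mu."""
    return lincomb([METRIC[mu] * p[mu] for mu in range(4)], GAMMA_UP)


def inverse_pslash_minus_m(p, m):
    """(p-slash - m)^(-1) = (p-slash + m) / (p^2 - m^2), valid off shell."""
    denom = sum(METRIC[mu] * p[mu] ** 2 for mu in range(4)) - m * m
    return lincomb([1.0 / denom, m / denom], [pslash(p), IDENTITY])


def max_ward_mismatch(p, m, h=1e-6):
    """Largest entry of  d/dp^mu (p-slash - m)^-1 + S gamma_mu S  over all mu."""
    s = inverse_pslash_minus_m(p, m)
    worst = 0.0
    for mu in range(4):
        up, dn = list(p), list(p)
        up[mu] += h
        dn[mu] -= h
        s_up = inverse_pslash_minus_m(up, m)
        s_dn = inverse_pslash_minus_m(dn, m)
        rhs = matmul(matmul(s, GAMMA_DOWN[mu]), s)
        for i in range(4):
            for j in range(4):
                deriv = (s_up[i][j] - s_dn[i][j]) / (2 * h)
                worst = max(worst, abs(deriv + rhs[i][j]))
    return worst


if __name__ == "__main__":
    print(max_ward_mismatch([2.0, 0.3, -0.4, 0.5], 1.0))
```

The printed mismatch should be at the level of the finite-difference error, which is the numerical counterpart of the statement that differentiating a propagator with respect to $p^\mu$ inserts a zero-momentum vertex $\gamma_\mu$.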

The quantities $\Sigma^{[2]}$ and $\Gamma_\mu^{[2]}$ are both $O(e^2)$, and contain ultraviolet
divergences which are cancelled by the $O(e^2)$ counter terms. From (11.11) and
(11.12) we have

    $$\Sigma^{[2]} = \bar{\Sigma}^{[2]} + \Sigma^{[2]}(\not{p} = m) + (\not{p} - m)(Z_2^{[2]} - 1)$$    (11.75)

where $\bar{\Sigma}^{[2]}$ is finite, and from (11.70) we have

    $$\Gamma_\mu^{[2]}(p,p') = \bar{\Gamma}_\mu^{[2]}(p,p') - (Z_1^{[2]} - 1)\gamma_\mu$$    (11.76)

where $\bar{\Gamma}_\mu^{[2]}$ is finite. Inserting (11.75) and (11.76) into (11.74) and equating
the infinite parts gives

    $$Z_1^{[2]} = Z_2^{[2]}.$$    (11.77)
This relation is true to all orders ($Z_1 = Z_2$), provided a gauge-invariant
regularization is used. It is a very significant relation, as already indicated
after (11.8). It shows, first, that the gauge principle survives renormalization
provided the regularization is gauge invariant. More physically, it tells us that
the bare and renormalized charges are related simply by (cf (11.6))

    $$e_0 = e\,Z_3^{-1/2}.$$    (11.78)

In other words, the interaction-dependent rescaling of the bare charge is due
solely to vacuum polarization effects in the photon propagator, which are
the same for all charged particles interacting with the photon. By contrast,
both $Z_1$ and $Z_2$ do depend on the specific type of the interacting charged
particle, since these quantities involve the particle masses. The ratio of bare
to renormalized charge is independent of particle type. Hence if a set of bare 
charges are all equal (or ‘universal’), the renormalized ones will be too. But 
we saw in section 2.6 how just such a notion of universality was present in 
theories constructed according to the (electromagnetic) gauge principle. We 
now see how the universality survives renormalization. In volume 2 we shall 
find that a similar universality holds, empirically, in the case of the weak 
interaction, giving a strong indication that this force too should be described 
by a renormalizable gauge theory. 


11.7 The anomalous magnetic moment and tests of QED 


Returning now to $\Gamma_\mu^{[2]}$, just as in section 11.5.2 we regarded the vacuum
polarization correction $\bar{\Pi}_\gamma^{[2]}$ as a contribution to the fermion’s charge form
factor $F_1(q^2)$, so we may expect that the vertex correction will also contribute
to the form factor. Indeed, let us recall the general form of the electromagnetic
vertex for a spin-$\tfrac{1}{2}$ particle (cf (8.208)):

    $$-\mathrm{i}e\,\bar{u}(p',s')\left[F_1(q^2)\,\gamma_\mu + \frac{\mathrm{i}\kappa F_2(q^2)}{2m}\,\sigma_{\mu\nu}q^\nu\right]u(p,s)$$    (11.79)

where $\kappa$ is the ‘anomalous’ part of the magnetic moment, i.e. the magnetic
moment is $(e\hbar/2m)(1 + \kappa)$, the ‘1’ being the Dirac value calculated in
section 3.5. In (11.79), $F_1$ and $F_2$ are each normalized to 1 at $q^2 = 0$. Our
vertex $\Gamma_\mu^{[2]}$ contributes to both the charge and the magnetic moment form
factors; let us call the contributions $F_1^{[2]}$ and $\kappa F_2^{[2]}$. Now the $Z_1$ counter term
multiplies $\gamma_\mu$, and therefore clearly cancels a divergence in $F_1^{[2]}$. Is there also,
we may ask, a divergence in $\kappa F_2^{[2]}$?

FIGURE 11.9
Contribution (which is finite) to $\gamma\gamma \to \gamma\gamma$.


Actually, $\kappa F_2^{[2]}$ is convergent, and this is highly significant to the physics of
renormalization. Had it been divergent, we would either have had to abandon
the theory or introduce a new counter term to cancel the divergence. This
counter term would have the general form

    $$\frac{\kappa}{m}\,\bar{\psi}\sigma_{\mu\nu}\psi F^{\mu\nu};$$    (11.80)

it is, indeed, an ‘anomalous magnetic moment’ interaction. But no such term 
exists in the original QED Lagrangian (11.1)! Its appearance does not seem 
to follow from the gauge principle argument, even though it is, in fact, gauge 
invariant. Part of the meaning of the renormalizability of QED (or any the- 
ory) is that all infinities can be cancelled by counter terms of the same form as 
the terms appearing in the original Lagrangian. This means, in other words, 
that all infinities can be cancelled by assuming an appropriate cut-off depen- 
dence for the fields and parameters in the bare Lagrangian. The interaction 
(11.80) is certainly gauge invariant — but it is non-renormalizable — as we 
shall discuss further later. The message is that, in a renormalizable theory, 
amplitudes which do not have counterparts in the interactions present in the 
bare Lagrangian must be finite. Figure 11.9 shows another example of an
amplitude which turns out to be finite: there is no ‘$A^4$’ type of interaction in
QED (cf figure 10.13 (a) and the attendant comment in section 10.5).

The calculation of the renormalized $F_1(q^2)$ and of $\kappa F_2(q^2)$ is quite
laborious, not least because three denominators are involved in the $\Gamma_\mu^{[2]}$ integral
(11.67). The dedicated reader can follow the story in section 6.3 of Peskin and
Schroeder (1995). The most important result is the value obtained for $\kappa$, the
QED-induced anomalous magnetic moment of the fermion, first calculated by
Schwinger (1948a). He obtained

    $$\kappa = \frac{\alpha}{2\pi} \approx 0.001\,1614$$    (11.81)

which means a g-factor corrected from the $g = 2$ Dirac value to

    $$g = 2 + \frac{\alpha}{\pi}$$    (11.82)

or, equivalently,

    $$[(g - 2)/2]_{\mathrm{Schwinger}} = \frac{\alpha}{2\pi} \approx 0.0011614.$$    (11.83)


Note that since $\kappa$ is a dimensionless quantity, it cannot depend on the mass $m$
of the internal fermion in (11.66). Contributions from two-loop (and higher)
diagrams can involve different leptons in internal lines, and hence can depend
on lepton mass ratios.
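
The numerical value in (11.81) follows directly from the fine structure constant. A minimal Python sketch, assuming the value of $\alpha^{-1}$ quoted later in (11.86) and using illustrative function names:

```python
import math

ALPHA_INVERSE = 137.035999037  # value of 1/alpha quoted in (11.86)


def schwinger_kappa(alpha):
    """One-loop anomalous magnetic moment, kappa = alpha / (2 pi), as in (11.81)."""
    return alpha / (2.0 * math.pi)


def g_factor(alpha):
    """g = 2 (1 + kappa) = 2 + alpha / pi, as in (11.82)."""
    return 2.0 * (1.0 + schwinger_kappa(alpha))


if __name__ == "__main__":
    alpha = 1.0 / ALPHA_INVERSE
    print(f"kappa = {schwinger_kappa(alpha):.7f}")
    print(f"g     = {g_factor(alpha):.7f}")
```

No mass enters the calculation, in line with the remark that the one-loop $\kappa$ is the same for every lepton.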

The prediction (11.83) may be compared with the experimental values
which are, for the electron (Hanneke et al. 2008)

    $$a_{e,\mathrm{expt}} = [(g_e - 2)/2]_{\mathrm{expt}} = 1\,159\,652\,180.73\,(0.28) \times 10^{-12}$$  [0.24 ppb]    (11.84)

and for the muon (Bennett et al. 2006)

    $$a_{\mu,\mathrm{expt}} = [(g_\mu - 2)/2]_{\mathrm{expt}} = 116\,592\,080\,(63) \times 10^{-11}$$  [0.54 ppm],    (11.85)

where the bracketed figures are the quoted uncertainties (statistical and
systematic combined in quadrature). Of course, in Schwinger’s day the
experimental accuracy was far different, but there was a real discrepancy (Kusch
and Foley 1947) with the Dirac value ($a = 0$). Schwinger’s one-loop
calculation provided a fundamental early confirmation of QED, and was the start
of a long confrontation between theory and experiment which still continues.
The interested reader is referred to the extensive review by Jegerlehner and
Nyffeler (2009), upon which we shall draw in the following.

The extraordinarily precise values in (11.84) and (11.85) represent the
result of ever more sophisticated and imaginative experimentation. The
measurement of $a_{e,\mathrm{expt}}$ is some 2250 times more accurate than that of $a_{\mu,\mathrm{expt}}$. Yet
the latter is capable of probing the Standard Model more deeply, for an
interesting reason. Consider expanding the vacuum polarization formula (11.18)
in powers of $m/\Lambda$, having done the momentum integrals as in (10.51) and
removed the $\ln\Lambda$ divergence by the subtraction (11.33). The resulting
expression will be finite as $\Lambda \to \infty$, but for finite $\Lambda$ it will contain $\Lambda$-dependent
terms, the first being of order $(m^2/\Lambda^2)$. This suggests that the contribution
of a ‘beyond QED physics’ scale to $a_{\mu,\mathrm{theory}}$ (modelled crudely by our cut-off)
would be enhanced by a factor $(m_\mu/m_e)^2 \sim 43000$ relative to its
contribution to $a_{e,\mathrm{theory}}$.¹ This outweighs by a factor of 19 the greater experimental
accuracy in $a_{e,\mathrm{expt}}$.
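
The three numbers in this argument are easily reproduced. The sketch below is illustrative: the lepton masses are assumed standard values that are not quoted in the text, and the names are hypothetical. The relative precisions are those of (11.84) and (11.85).

```python
# Enhancement of new-physics sensitivity in a_mu relative to a_e.
# The lepton masses are assumed values, not taken from the text.
M_MUON_MEV = 105.658
M_ELECTRON_MEV = 0.510999

PRECISION_E_PPB = 0.24     # relative precision of a_e from (11.84)
PRECISION_MU_PPB = 540.0   # 0.54 ppm from (11.85), expressed in ppb


def mass_enhancement(m_heavy, m_light):
    """Factor (m_heavy / m_light)^2 from terms of order m^2 / Lambda^2."""
    return (m_heavy / m_light) ** 2


def precision_ratio(coarse_ppb, fine_ppb):
    """How many times more accurate the finer measurement is."""
    return coarse_ppb / fine_ppb


def net_advantage():
    """Mass enhancement divided by the experimental precision penalty."""
    return (mass_enhancement(M_MUON_MEV, M_ELECTRON_MEV)
            / precision_ratio(PRECISION_MU_PPB, PRECISION_E_PPB))


if __name__ == "__main__":
    print("mass enhancement:", mass_enhancement(M_MUON_MEV, M_ELECTRON_MEV))
    print("precision ratio :", precision_ratio(PRECISION_MU_PPB, PRECISION_E_PPB))
    print("net advantage   :", net_advantage())
```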

This is both good news and bad news. We may distinguish three distinct
contributions to ‘beyond QED physics’ in $a_{e,\mathrm{theory}}$ and $a_{\mu,\mathrm{theory}}$: (i) SM weak
interactions; (ii) SM strong (or hadronic) interactions; (iii) beyond the SM
physics. Representative diagrams contributing to (i) and (ii) are shown in
figure 11.10 (a) and (b) respectively. Sensitivity of $a_{\mu,\mathrm{theory}}$ to effects under (i)
is welcome, since they are calculable, and in principle may provide precision
tests of the theory. Effects under (ii), however, are difficult to control, and
may limit the precision of the theoretical prediction, and hence the capacity
to discern the appearance of ‘beyond the SM physics’.


¹ The sensitivity would be even greater for $a_\tau$ of course, but the very short lifetime of
the $\tau$ precludes an accurate measurement of its magnetic moment, at present.


FIGURE 11.10
‘Beyond QED’ contributions to $a_{L,\mathrm{theory}}$ ($L = e, \mu$) due to (a) weak and (b)
strong interaction corrections.

In the case of $a_{e,\mathrm{theory}}$, it turns out that the sensitivity to effects under (i)
and (ii) is very small. This allows for an essentially pure QED high precision
prediction of $a_e$. The accuracy of the experimental number requires calculation
of QED corrections up to 8th order, i.e. terms proportional to $(\alpha/\pi)^4$, which
contain 4 loops; there are 891 such diagrams. Their contribution has been
calculated by numerical methods by Kinoshita and collaborators (Aoyama et
al. 2007, 2008; Kinoshita and Nio 2006), who have also estimated the 10th
order (5-loop) contributions. To compare with experiment, a value of the fine
structure constant $\alpha$ is required. The most accurate value currently quoted is
(Bouchendira et al. 2011)

    $$\alpha^{-1} = 137.035\,999\,037\,(91)$$  [0.66 ppb].    (11.86)

With this $\alpha$ the theoretical (QED) prediction of $a_e$ is

    $$a^{\mathrm{QED}}_{e,\mathrm{theory}} = 1\,159\,652\,181.13\,(0.11)\,(0.37)\,(0.77) \times 10^{-12}$$    (11.87)

where the first, second, and third uncertainties come from the calculated 8th
order terms, the 10th order estimate, and the fine structure constant (11.86).
The theory is thus in good agreement with experiment, at an extraordinary
level of precision:

    $$a_{e,\mathrm{expt}} - a_{e,\mathrm{theory}} = -0.40\,(0.88) \times 10^{-12}.$$    (11.88)

The QED part of the Standard Model is indeed the paradigm quantum field
theory. Further progress will depend on the evaluation of the 10th order
(5-loop) terms.
Turning now to $a_{\mu,\mathrm{theory}}$, the ‘pure QED’ part has been evaluated up to
4 loops and estimated at the 5-loop level, with the result (Jegerlehner and
Nyffeler 2009)

    $$a^{\mathrm{QED}}_{\mu,\mathrm{theory}} = 116\,584\,718.1\,(0.2) \times 10^{-11}$$    (11.89)

where the error results from the uncertainties in the lepton mass ratios, the
numerical error in the $\alpha^4$ terms, the estimated uncertainty in the $\alpha^5$ terms,
and the uncertainty in the value of $\alpha$, which in (11.89) is determined from
$a_{e,\mathrm{expt}}$. There are also electroweak and hadronic contributions, $a^{\mathrm{EW}}_{\mu,\mathrm{theory}}$ and
$a^{\mathrm{had}}_{\mu,\mathrm{theory}}$. The first of these has been evaluated up to 2 loops, and the 3-loop
effects are negligible; the result is (Jegerlehner and Nyffeler 2009)

    $$a^{\mathrm{EW}}_{\mu,\mathrm{theory}} = 153.2\,(1.8) \times 10^{-11}.$$    (11.90)

$a^{\mathrm{had}}_{\mu,\mathrm{theory}}$ is considerably larger, and has larger uncertainties. Its value is the
subject of intensive ongoing theoretical effort, and is likely to be regularly
updated. Here we give the value arrived at by Jegerlehner and Nyffeler (2009),
namely

    $$a^{\mathrm{had}}_{\mu,\mathrm{theory}} = 6918.8\,(65) \times 10^{-11}.$$    (11.91)

Adding together (11.89), (11.90) and (11.91) gives the Standard Model
prediction

    $$a^{\mathrm{SM}}_{\mu,\mathrm{theory}} = 116\,591\,790.1\,(65) \times 10^{-11}.$$    (11.92)

It is worth stressing that all of the Standard Model (electromagnetic, weak
and strong theories) is needed for the result (11.92); it is also interesting that
the theoretical error is essentially the same as the experimental one, at this
stage.

Comparison of (11.92) and (11.85) yields

    $$a_{\mu,\mathrm{expt}} - a^{\mathrm{SM}}_{\mu,\mathrm{theory}} = 290\,(90) \times 10^{-11}.$$    (11.93)


Equation (11.93) represents a discrepancy of some 3 standard deviations. This
discrepancy between experiment and the SM prediction has persisted now for
a number of years, and is one of the very few significant (at this level) such
discrepancies. While it may be premature to conclude that $a_\mu$ can definitely
not be understood without some ‘beyond the SM’ physics, many such
possibilities are reviewed by Jegerlehner and Nyffeler (2009). No doubt this epic
confrontation between theory and experiment will continue to be pursued: it
is a classic example of the way in which a very high-precision measurement
in a thoroughly ‘low-energy’ area of physics (a magnetic moment) can have
profound impact on the ‘high-energy’ frontier, a circumstance we may be
increasingly dependent upon.
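
The arithmetic behind (11.92) and (11.93) can be reproduced in a few lines. The following Python sketch uses the central values and uncertainties quoted in (11.85) and (11.89) to (11.91), in units of $10^{-11}$. Combining the uncertainties in quadrature, which treats them as independent, is an assumption of this sketch, and the names are illustrative.

```python
import math

# (central value, uncertainty) in units of 1e-11, taken from the text.
A_MU_EXPT = (116592080.0, 63.0)   # (11.85)
A_MU_QED = (116584718.1, 0.2)     # (11.89)
A_MU_EW = (153.2, 1.8)            # (11.90)
A_MU_HAD = (6918.8, 65.0)         # (11.91)


def add_in_quadrature(*terms):
    """Sum central values; combine uncertainties assuming independence."""
    value = sum(v for v, _ in terms)
    error = math.sqrt(sum(e * e for _, e in terms))
    return value, error


def discrepancy(expt, theory):
    """Return (difference, combined uncertainty, number of standard deviations)."""
    diff = expt[0] - theory[0]
    err = math.hypot(expt[1], theory[1])
    return diff, err, diff / err


if __name__ == "__main__":
    sm = add_in_quadrature(A_MU_QED, A_MU_EW, A_MU_HAD)
    diff, err, n_sigma = discrepancy(A_MU_EXPT, sm)
    print(f"SM prediction: {sm[0]:.1f} +/- {sm[1]:.0f}")
    print(f"difference   : {diff:.0f} +/- {err:.0f}  ({n_sigma:.1f} sigma)")
```

The hadronic uncertainty dominates the theoretical error, which is why the text singles out the hadronic contribution as the limiting factor.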

One conclusion we can certainly draw is that renormalizable quantum field 
theories are the most predictive theories we have. We end this volume with 
some general reflections on renormalizable, and non-renormalizable, theories. 
11.8 Which theories are renormalizable — and does it matter?


In the course of our travels thus far, we have met theories which exhibit
three different types of ultraviolet behaviour. In the ABC theory at one-loop
order, we found that both the field strength renormalizations and the vertex
correction were finite; only the mass shifts diverged as $\Lambda \to \infty$. The theory was
called ‘super-renormalizable’. In QED, we needed divergent renormalization
constants $Z_i$ as well as an infinite mass shift, but (although we did not
attempt to explain why) these counter terms were enough to cure divergences
systematically to all orders and the theory was renormalizable. Finally, we
asserted that the anomalous coupling (11.80) was non-renormalizable. In the
final section of this volume we shall try to shed more light on these distinctions
and their significance.

Is there some way of telling which of these ultraviolet behaviours a given
Lagrangian is going to exhibit, without going through the calculations? The
answer is yes (nearly), and the test is surprisingly simple. It has to do with the
dimensionality of a theory’s coupling constant. We have seen (section 6.3.1)
that the dimensionality of ‘g’ in the ABC theory is $M^1$ (using mass as the
remaining dimension when $\hbar = c = 1$), that of $e$ in QED is $M^0$ (section 7.4)
and that of the coefficient of the anomalous coupling $\bar{\psi}\sigma_{\mu\nu}\psi F^{\mu\nu}$ in (11.80)
is $M^{-1}$. These couplings have positive, zero and negative mass dimension,
respectively. It is no accident that the three theories, with different dimensions
for their couplings, have different ultraviolet behaviour and hence different
renormalizability.

That coupling constant dimensionality and ultraviolet behaviour are
related can be understood by simple dimensional considerations. Compare, for
example, the vertex corrections in the ABC theory (figure 10.6) and in QED
(figure 11.8). These amplitudes behave essentially as

    $$G^{[2]} \sim g_{\mathrm{ph}}^2 \int \frac{d^4k}{k^2\,k^2\,k^2}$$    (11.94)

and

    $$\Gamma^{[2]} \sim e^2 \int \frac{d^4k}{\not{k}\,\not{k}\,k^2}$$    (11.95)

respectively, for large $k$. Both are dimensionless: but in (11.94) the positive
(mass)$^2$ dimension of $g_{\mathrm{ph}}^2$ is compensated by two additional factors of $k$ in
the denominator of the integral, as compared with (11.95), with the result
that (11.94) is ultraviolet convergent but (11.95) is not. The analysis can be
extended to higher-order diagrams: for the ABC theory, the more powers of
$g_{\mathrm{ph}}$ which are involved, the more denominator factors are necessary, and hence
the better the convergence is. Indeed, in this kind of ‘super-renormalizable’
theory, only a finite number of diagrams are ultraviolet divergent, to all orders
in perturbation theory.

It is clear that some kind of opposite situation must obtain when the
coupling constant dimensionality is negative; for then, as the order of the
perturbation theory increases, the negative powers of $M$ in the coupling constant
factors must be compensated by positive powers of $k$ in the numerators of
loop integrals. Hence the divergence will tend to get worse at each successive
order. A famous example of such a theory is Fermi’s original theory of $\beta$-decay
(Fermi 1934a, b), referred to in section 1.3.5, in which the interaction density
has the ‘four-fermion’ form

    $$G_F\,\bar{\psi}_p(x)\psi_n(x)\,\bar{\psi}_e(x)\psi_{\nu_e}(x)$$    (11.96)

where $G_F$ is the ‘Fermi constant’. To find the dimensionality of $G_F$, we first
establish that of the fermion field by considering a mass term $m\bar{\psi}\psi$, for
example. The integral of this over $d^3x$ gives one term in the Hamiltonian, which has
dimension $M$. We deduce that $[\psi] = \tfrac{3}{2}$, since $[d^3x] = -3$. Hence $[\bar{\psi}\psi\bar{\psi}\psi] = 6$,
and so $[G_F] = -2$. The coupling constant $G_F$ in (11.96) therefore has a
negative mass dimension, just like the coefficient $\kappa/m$ in (11.80). Indeed, the
four-fermion theory is also non-renormalizable.
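
The dimension counting used here is mechanical enough to automate. The sketch below is an illustration, assuming four space-time dimensions, natural units, and the equivalent statement that each term in the Lagrangian density has mass dimension 4; the dictionary keys and function name are hypothetical. It reproduces the coupling dimensions quoted for the ABC theory, QED, the anomalous coupling (11.80) and the four-fermion interaction (11.96).

```python
from fractions import Fraction

# Mass dimensions of the building blocks in four space-time dimensions
# (natural units). Keys are illustrative labels chosen for this sketch.
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "photon": Fraction(1),          # the potential A_mu
    "field_strength": Fraction(2),  # F_{mu nu}, one derivative of A_mu
    "derivative": Fraction(1),
}


def coupling_dimension(operator):
    """Mass dimension of the coupling multiplying an operator.

    `operator` maps a building block to the number of times it appears;
    the Lagrangian density is assumed to have total mass dimension 4.
    """
    content = sum(FIELD_DIMENSION[name] * count for name, count in operator.items())
    return 4 - content


if __name__ == "__main__":
    examples = {
        "ABC theory, g": {"scalar": 3},
        "QED, e": {"fermion": 2, "photon": 1},
        "anomalous coupling (11.80)": {"fermion": 2, "field_strength": 1},
        "four-fermion (11.96), G_F": {"fermion": 4},
    }
    for label, operator in examples.items():
        print(f"{label}: {coupling_dimension(operator)}")
```

Positive, zero and negative results correspond to the super-renormalizable, renormalizable and non-renormalizable cases described in the text, with the caveat, noted by the authors, that the test is only "nearly" decisive.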

Must such a theory be rejected? Let us briefly sketch the consequences of
an interaction of the form (11.96), but slightly simpler, namely

    $$G_F\,\bar{\psi}_n(x)\psi_n(x)\,\bar{\psi}_{\nu_e}(x)\psi_{\nu_e}(x)$$    (11.97)

where, for the present purposes, the neutron is regarded as point-like.
Consider, for example, the scattering process $\nu_e + n \to \nu_e + n$. To lowest order
in $G_F$, this is given by the tree diagram (or ‘contact term’) of figure 11.11,
which contributes a constant $-\mathrm{i}G_F$ to the invariant amplitude for the process,
disregarding the spinor factors for the moment. A one-loop $O(G_F^2)$ correction
is shown in figure 11.12. Inspection of figure 11.12 shows that this is an
s-channel process (recall section 6.3.3): let us call the amplitude $-\mathrm{i}G_F G^{[2]}(s)$,
where one $G_F$ factor has been extracted, so that the correction can be
compared with the tree amplitude and $G^{[2]}(s)$ is dimensionless. Then $G^{[2]}(s)$ is
given by

    $$G^{[2]}(s) = -\mathrm{i}G_F \int \frac{d^4k}{(2\pi)^4}\,\frac{\mathrm{i}}{\not{k} - m_{\nu_e}}\,\frac{\mathrm{i}}{(\not{p}_{\nu_e} + \not{p}_n - \not{k}) - m_n}.$$    (11.98)

As expected, the negative mass dimension of $G_F$ leaves fewer $k$-factors in the
denominator of the loop integral. Indeed, manipulations exactly like those
we used in the case of $\Sigma^{[2]}$ show that $G^{[2]}(s)$ has a quadratic divergence,
and that $dG^{[2]}/ds$ has a logarithmic divergence. The extra denominators
associated with second and higher derivatives of $G^{[2]}(s)$ are sufficient to make
these integrals finite.
FIGURE 11.11
Lowest order contribution to $\nu_e + n \to \nu_e + n$ in the model defined by the
interaction (11.97).


FIGURE 11.12
Second-order (one-loop) contribution to $\nu_e + n \to \nu_e + n$.


The standard procedure would now be to cancel these divergences with
counter terms. There will certainly be one counter term arising naturally
from writing the bare version of (11.97) as (cf (11.5)):

    $$G_{0F}\,\bar{\psi}_{0n}\psi_{0n}\,\bar{\psi}_{0\nu_e}\psi_{0\nu_e} = G_F\,\bar{\psi}_n\psi_n\,\bar{\psi}_{\nu_e}\psi_{\nu_e} + (Z_4 - 1)\,G_F\,\bar{\psi}_n\psi_n\,\bar{\psi}_{\nu_e}\psi_{\nu_e}$$    (11.99)

where $Z_4 G_F = G_{0F} Z_{2n} Z_{2\nu_e}$, and the $Z_2$’s are the field strength
renormalization constants for the $n$ and $\nu_e$ fields. Including the tree graph of figure 11.11,
the amplitude of figure 11.12, and the counter term, the total amplitude to
$O(G_F^2)$ is given by

    $$\mathrm{i}\mathcal{M} = -\mathrm{i}G_F - \mathrm{i}G_F\,G^{[2]}(s) - \mathrm{i}G_F\,(Z_4 - 1).$$    (11.100)

As in our earlier examples, $Z_4$ will be determined from a renormalization
condition. In this case, we might demand, for example, that the amplitude
$\mathcal{M}$ reduces to $-G_F$ at the threshold value $s = s_0$, where $s_0 = (m_n + m_{\nu_e})^2$.
Then to $O(G_F)$ we find

    $$Z_4^{[2]} = 1 - G^{[2]}(s_0)$$    (11.101)

and our amplitude (11.100) is, in fact,

    $$-\mathrm{i}G_F - \mathrm{i}G_F\,[G^{[2]}(s) - G^{[2]}(s_0)].$$    (11.102)

In (11.102), we see the familiar outcome of such renormalization: the
appearance of subtractions of the divergent amplitude (cf (10.74), (11.11),
(11.33) and (11.70)). In fact, because $dG^{[2]}/ds$ is also divergent, we need a
second subtraction and, correspondingly, a new counter term, not present in
the original Lagrangian, of the form

    $$G_d\,\bar{\psi}_n \not{\partial}\psi_n\,\bar{\psi}_{\nu_e} \not{\partial}\psi_{\nu_e}$$

for example; there will also be others, but we are concerned only with the
general idea. The occurrence of such a new counter term is characteristic of a
non-renormalizable theory, but at this stage of the proceedings the only penalty
we pay is the need to import another constant from experiment, namely the
value $D$ of $dG^{[2]}/ds$ at some fixed $s$, say $s = s_0$; $D$ will be related to the
renormalized value of $G_d$. We will then write our renormalized amplitude, up
to $O(G_F^2)$, as

    $$-\mathrm{i}G_F\,(1 + D(s - s_0)) - \mathrm{i}G_F\,\bar{G}^{[2]}(s)$$    (11.103)

where $\bar{G}^{[2]}(s)$ is finite, and vanishes along with its first derivative at $s = s_0$;
that is, $\bar{G}^{[2]}(s)$ contributes calculable terms of order $(s - s_0)^2$ if expanded
about $s = s_0$.

The moral of the story so far, then, is that we can perform a one-loop
renormalization of this theory, at the cost of taking additional parameters
from experiments and introducing new terms in the Lagrangian. What about
the next order? Figure 11.13 shows a two-loop diagram in our theory, which is
of order $G_F^3$. Writing the amplitude as $-\mathrm{i}G_F G^{[3]}(s)$, the ultraviolet behaviour
of $G^{[3]}(s)$ is given by

    $$(-\mathrm{i}G_F)^2 \int \frac{d^4k_1\,d^4k_2}{\not{k}_1\,\not{k}_2\,\not{k}\,\not{k}}$$    (11.104)

where $k$ is a linear function of $k_1$ and $k_2$. This has a leading ultraviolet
divergence $\sim \Lambda^4$, even worse than that of $G^{[2]}$. As suggested earlier, it is
indeed the case that, the higher we go in perturbation theory in this model,
the worse the divergences become. We can, of course, eliminate this divergence
in $G^{[3]}$ by performing a further subtraction, requiring the provision of more
parameters from experiment. By now the pattern should be becoming clear:
new counter terms will have to be introduced at each order of perturbation
theory, and ultimately we shall need an infinite number of them, and hence
an infinite number of parameters determined from experiment, and we shall
have zero predictive capacity.
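
The pattern of divergences met so far follows from naive power counting at large loop momentum. The Python sketch below is an illustration with hypothetical names; it counts only the superficial degree of divergence, assuming four powers of momentum per loop integration, one inverse power per fermion propagator, two per boson propagator, and two for each derivative with respect to an external invariant such as $s$.

```python
def superficial_degree(loops, fermion_lines, boson_lines, derivatives=0):
    """Naive power of the cut-off expected at large loop momentum.

    Each d^4k adds 4, each fermion propagator removes 1, each boson
    propagator removes 2, and each derivative with respect to an external
    invariant such as s removes 2.
    """
    return 4 * loops - fermion_lines - 2 * boson_lines - 2 * derivatives


def describe(degree):
    if degree < 0:
        return "convergent"
    if degree == 0:
        return "logarithmically divergent"
    return f"divergent like cut-off^{degree}"


EXAMPLES = {
    "ABC vertex (11.94)": superficial_degree(1, 0, 3),
    "QED vertex (11.95)": superficial_degree(1, 2, 1),
    "four-fermion, one loop (11.98)": superficial_degree(1, 2, 0),
    "d/ds of (11.98)": superficial_degree(1, 2, 0, derivatives=1),
    "four-fermion, two loops (11.104)": superficial_degree(2, 4, 0),
}

if __name__ == "__main__":
    for label, degree in EXAMPLES.items():
        print(f"{label}: {describe(degree)}")
```

Each extra loop in the four-fermion model adds four powers of momentum but only two fermion propagators, so the degree rises by two per order, which is the worsening behaviour described above.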

FIGURE 11.13
A two-loop contribution to $\nu_e + n \to \nu_e + n$ in the model defined by (11.97).


Does this imply that the theory is useless? We have learned that $\bar{G}^{[2]}(s)$
produces a calculable term of order $G_F^2 (s - s_0)^2$ when expanded about $s = s_0$;
and that $\bar{G}^{[3]}$ will produce a calculable term of order $G_F^3 (s - s_0)^3$, and so on.
Now, from the discussion after (11.96), $G_F$ itself is a dimensionless number
divided by the square of some mass. As we saw in section 1.3.5 (and will return
to in more detail in volume 2), in the case of the physical weak interaction
this mass in $G_F$ is the W-mass, and $G_F \sim \alpha/M_W^2$. Hence our loop corrections
have the form $\alpha^2 (s - s_0)^2/M_W^4$, $\alpha^3 (s - s_0)^3/M_W^6$, .... We now see that for low
enough energy close to threshold, where $(s - s_0) \ll M_W^2$, it will be a good
approximation to stop at the one-loop level. As we go up in energy, we will
need to include higher-order loops, and correspondingly more parameters will
have to be drawn from experiment. But only when we begin to approach an
energy $\sqrt{s} \sim M_W/\sqrt{\alpha} \sim G_F^{-1/2} \sim 300$ GeV will this theory be terminally sick.
This was pointed out by Heisenberg (1939). For this argument to work, it is
important that the ultraviolet divergences at a given order in perturbation
theory (i.e. a given number of loops) should have been removed by
renormalization, otherwise factors of $\Lambda^2$ will enter, in place of the $(s - s_0)$ factors, for
example.
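
The scale at which the expansion breaks down can be estimated numerically. The sketch below is an order-of-magnitude illustration only: the value of $G_F$ and the threshold energy (taken to be roughly the neutron mass) are assumed inputs not quoted in the text, the names are hypothetical, and $G_F (s - s_0)$ is used as a crude stand-in for the size of successive corrections.

```python
# Order-of-magnitude estimate of where the four-fermion expansion fails.
# G_FERMI and SQRT_S0 are assumed inputs, not values quoted in the text.
G_FERMI = 1.1664e-5   # GeV^-2
SQRT_S0 = 0.94        # GeV, roughly the neutron mass (threshold energy)


def natural_scale(g_fermi=G_FERMI):
    """Energy scale G_F^(-1/2) in GeV."""
    return g_fermi ** -0.5


def expansion_parameter(sqrt_s, sqrt_s0=SQRT_S0, g_fermi=G_FERMI):
    """Dimensionless combination G_F (s - s0) controlling successive terms."""
    return g_fermi * (sqrt_s ** 2 - sqrt_s0 ** 2)


if __name__ == "__main__":
    print(f"G_F^(-1/2) = {natural_scale():.0f} GeV")
    for sqrt_s in (1.0, 10.0, 100.0, 300.0):
        print(f"sqrt(s) = {sqrt_s:6.1f} GeV: G_F (s - s0) = {expansion_parameter(sqrt_s):.2e}")
```

The parameter is tiny at the energies of nuclear $\beta$-decay and becomes of order one only near $G_F^{-1/2}$, in line with the estimate of about 300 GeV.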

We have seen that a non-renormalizable theory can be useful at energies
well below the ‘natural’ scale specified by its coupling constant. Let us look at
this in a slightly different way, by considering the two four-fermion interaction
terms introduced at one loop,

    $$G_F\,\bar{\psi}_n\psi_n\,\bar{\psi}_{\nu_e}\psi_{\nu_e} \quad\text{and}\quad G_d\,\bar{\psi}_n \not{\partial}\psi_n\,\bar{\psi}_{\nu_e} \not{\partial}\psi_{\nu_e}.$$    (11.105)

We know that $G_F \sim M_W^{-2}$, and similarly $G_d \sim M_W^{-4}$ (from dimensional
counting, or from the association of the $G_d$ term with the $O(G_F^2)$ counter term).
From dimensional analysis, or by referring to (11.103) and remembering that
$D$ is of order $G_F$ for consistency, we see that the second term in (11.105), when
evaluated at tree level, is of order $(s - s_0)/M_W^2$ times the first. It follows that
higher derivative interactions, and in general terms with successively larger
negative mass dimension, are increasingly suppressed at low energies.

Where, then, do renormalizable theories fit into this? Those with
couplings having positive mass dimension (‘super-renormalizable’) have, as we
have seen, a limited number of infinities and can be quickly renormalized.
The ‘merely renormalizable’ theories have dimensionless coupling constants,
such as $e$ (or $\alpha$). In this case, since there are no mass factors (for good or ill)
to be associated with powers of $\alpha$, as we go up in order of perturbation theory
it would seem plausible that the divergences get essentially no worse, and can
be cured by the counter terms which compensated those simplest divergences
which we examined in earlier sections, though for QED the proof is difficult,
and took many years to perfect.

Given any renormalizable theory, such as QED, it is always possible to 
suppose that the ‘true’ theory contains additional non-renormalizable terms, 
provided their mass scale is very much larger than the energy scale at which 
the theory has been tested. For example, a term of the form (11.80) with
‘$\kappa/m$’ replaced by some very large inverse mass $M^{-1}$ would be possible, and
would contribute an amount of order $4e/M$ to a lepton magnetic moment.
The present level of agreement between theory and experiment in the case of 
the electron’s moment implies that M > 4 x 10° GeV. 

From this perspective, then, it may be less of a mystery why
renormalizable theories are generally the relevant ones at presently probed energies.
Returning to the line of thought introduced in section 10.1.1, we may
imagine that a ‘true’ theory exists at some enormously high energy $\Lambda$ (the Planck
scale?) which, though not itself a local quantum field theory, can be written
in terms of all possible fields and their couplings, as allowed by certain
symmetry principles. Our particular renormalizable subset of these theories then
emerges as a low-energy effective theory, due to the strong suppression of the
non-renormalizable terms. Of course, for this point of view to hold, we must
assume that the latter interactions do not have ‘unnaturally large’ couplings,
when expressed in terms of $\Lambda$.

This interpretation, if correct, deals rather neatly with what was, for many 
physicists, an awkward aspect of renormalizable theories. On the one hand, 
it was certainly an achievement to have rendered all perturbative calculations 
finite as the cut-off went to infinity; but on the other, it was surely unreason- 
able to expect any such theory, established by confrontation with experiments 
in currently accessible energy regimes, really to describe physics at arbitrarily 
high energies. On the ‘low-energy effective field theory’ interpretation, we can 
enjoy the calculational advantages of renormalizable field theories, while ac- 
knowledging — with no contradiction — the likelihood that at some scale ‘new 
physics’ will enter. 

Having thus argued that renormalizable theories emerge ‘naturally’ as low- 
energy theories, we now seem to be faced with another puzzle: why were weak 
interactions successfully describable, for many years, in terms of the non- 
renormalizable four-fermion theory? The answer is that non-renormalizable 
theories may be physically detectable at low energies if they contribute to 
processes that would otherwise be forbidden. For example, the fact that (as 
far as we know) neutrinos have neither electromagnetic nor strong interactions, 
but only weak interactions, allowed the four-fermion theory to be detected;
but amplitudes were suppressed by powers of $s/M_W^2$ (relative to comparable
electromagnetic ones) and this was, indeed, why it was called ‘weak’!




FIGURE 11.14 
One-Z (Yukawa-type) exchange process in $\nu_e + n \to \nu_e + n$.


In the case of the weak interaction, the reader may perhaps wonder why — if 
it was understood that the four-fermion theory could after all be handled up to 
energies of order 10 GeV — so much effort went in to creating a renormalizable 
theory of weak interactions, as it undoubtedly did. Part of the answer is that 
the utility of non-renormalizable interactions was a rather late realization (see, 
for example, Weinberg 1979). But surely the prospect of having a theory with 
the predictive power of QED was a determining factor. At all events, the 
preceding argument for the ‘naturalness’ of renormalizable theories as low- 
energy effective theories provides strong expectation that such a description 
of weak interactions should exist. 

We shall discuss the construction of the currently accepted renormalizable 
theory of electroweak interactions in volume 2. We can already anticipate 
that the first step will be to replace the ‘negative-mass-dimensioned’ constant
$G_{\rm F}$ by a dimensionless one. The most obvious way to do this is to envisage
a Yukawa-type theory of weak interactions mediated by a massive quantum
(as, of course, Yukawa himself did; see section 1.3.5). The four-fermion
process of figure 11.11 would then be replaced by that of figure 11.14, with
amplitude (omitting spinors) $\sim g_Z^2/(q^2 - m_Z^2)$, where $g_Z$ is dimensionless. For
small $q^2 \ll m_Z^2$, this reduces to the contact four-fermion form of figure 11.11,
with an effective $G_{\rm F} \sim g_Z^2/m_Z^2$, showing the origin of the negative mass
dimensions of $G_{\rm F}$. It is clear that even if the new theory were to be
renormalizable, many low-energy processes would be well described by an effective
non-renormalizable four-fermion theory, as was indeed the case historically.
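
The reduction of the exchange amplitude to a contact interaction at small $q^2$ can be illustrated numerically. The following is only a sketch: the coupling and mass values are arbitrary illustrative choices, not values taken from the text, and the overall sign of the amplitude (ignored in the ‘$\sim$’ estimates above) is kept explicit.

```python
def exchange_amplitude(q2, g, m):
    """Massive-quantum exchange amplitude g^2 / (q^2 - m^2), spinors omitted."""
    return g**2 / (q2 - m**2)


def contact_amplitude(g, m):
    """Low-energy contact limit -g^2 / m^2, i.e. magnitude G_eff = g^2 / m^2."""
    return -g**2 / m**2


g = 0.65   # illustrative dimensionless coupling (assumption)
m = 91.0   # illustrative mediator mass in GeV (assumption)

for q2 in (-1.0, -100.0, -1000.0):  # spacelike q^2 in GeV^2
    ratio = exchange_amplitude(q2, g, m) / contact_amplitude(g, m)
    print(f"q^2 = {q2:8.1f} GeV^2   exchange/contact = {ratio:.4f}")
```

The ratio is $1/(1 - q^2/m^2)$, so it stays close to unity as long as $|q^2| \ll m^2$ and departs from it only as $|q^2|$ approaches $m^2$.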

Unfortunately, we shall see in volume 2 that the application of this simple 
idea to the charge-changing weak interactions does not, after all, lead to a 
renormalizable theory. This teaches us an important lesson: a dimensionless 
coupling does not necessarily guarantee renormalizability. 
To arrive at a renormalizable theory of the weak interactions it seems to be
necessary to describe them in terms of a gauge theory (recall the ‘universality’ 
hints mentioned in section 11.6). Yet the mediating gauge field quanta have 
mass, which appears to contradict gauge invariance. The remarkable story of 
how gauge field quanta can acquire mass while preserving gauge invariance is 
reserved for volume 2. 

A number of other non-renormalizable interactions are worth mentioning. 
Perhaps the most famous of all is gravity, characterized by Newton’s constant
$G_{\rm N}$, which has the value $(1.2 \times 10^{19}\ {\rm GeV})^{-2}$. The detection of gravity at
energies so far below $10^{19}$ GeV is due, of course, to the fact that the gravitational
fields of all the particles in a macroscopic piece of matter add up coherently.
At the level of the individual particles, its effect is still entirely negligible.
Another example may be provided by baryon and/or lepton violating interactions,
mediated by highly suppressed non-renormalizable terms.² Such things
are frequently found when the low-energy limit is taken of theories defined
(for example) at energies of order $10^{16}$ GeV or higher.

The stage is now set for the discussion, in volume 2, of the renormalizable 
non-Abelian gauge field theories which describe the weak and strong sectors 
of the Standard Model. 


Problems 
11.1 Establish the values of the counter terms given in (11.12). 


11.2 Convince yourself of the rule ‘each closed fermion loop carries an
additional factor $-1$’.


11.3 Explain why the trace is taken in (11.14). 
11.4 Verify (11.15). 


11.5 Verify the quoted relation $P^\rho_{\ \nu} P^\nu_{\ \tau} = P^\rho_{\ \tau}$, where $P^\rho_{\ \nu} = g^\rho_{\ \nu} - q^\rho q_\nu/q^2$ (cf
(11.26)).


11.6 Verify (11.39) for $q^2 \ll m^2$.
11.7 Verify (11.55) for $-q^2 \gg m^2$.
11.8 Check the estimate (11.60). 


11.9 Find the dimensionality of ‘$E$’ in an interaction of the form $E(F_{\mu\nu}F^{\mu\nu})^2$.
Express this interaction in terms of the $\boldsymbol E$ and $\boldsymbol B$ fields. Is such a term finite
or infinite in QED? How might it be measured?


² The most general renormalizable Lagrangian with the field content, and the gauge
symmetries, of the Standard Model automatically conserves baryon and lepton number 
(Weinberg 1996, pp 316-7). 


A 


Non-relativistic Quantum Mechanics 


This appendix is intended as a very terse ‘revision’ summary of those aspects 
of non-relativistic quantum mechanics that are particularly relevant for this 
book. A fuller account may be found in Mandl (1992), for example. 

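Natural units $\hbar = c = 1$ (see appendix B).

Fundamental postulate of quantum mechanics:

$$[\hat p_i, \hat x_j] = -i\delta_{ij}. \tag{A.1}$$

Coordinate representation:

$$\hat{\boldsymbol p} = -i\boldsymbol\nabla \tag{A.2}$$
$$\hat H \psi(\boldsymbol x, t) = i\frac{\partial \psi(\boldsymbol x, t)}{\partial t} \tag{A.3}$$
$$\hat H = \frac{\hat{\boldsymbol p}^2}{2m} + \hat V \tag{A.4}$$
and so
$$\left(-\frac{1}{2m}\boldsymbol\nabla^2 + \hat V(\boldsymbol x, t)\right)\psi(\boldsymbol x, t) = i\frac{\partial \psi(\boldsymbol x, t)}{\partial t}. \tag{A.5}$$
Probability density and current (see problem 3.1 (a)):
$$\rho = \psi^*\psi = |\psi|^2 \geq 0 \tag{A.6}$$
$$\boldsymbol j = \frac{1}{2mi}\left[\psi^*(\boldsymbol\nabla\psi) - (\boldsymbol\nabla\psi^*)\psi\right] \tag{A.7}$$
with
$$\frac{\partial\rho}{\partial t} + \boldsymbol\nabla\cdot\boldsymbol j = 0. \tag{A.8}$$
Free-particle solutions:
$$\phi(\boldsymbol x, t) = u(\boldsymbol x)\, e^{-iEt} \tag{A.9}$$
$$\hat H_0 u = E u \tag{A.10}$$
where
$$\hat H_0 = \hat H(V = 0). \tag{A.11}$$
Box normalization:
$$\int_V u^*(\boldsymbol x)\, u(\boldsymbol x)\, d^3x = 1. \tag{A.12}$$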


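Angular momentum: Three Hermitian operators $(\hat J_x, \hat J_y, \hat J_z)$ satisfying

$$[\hat J_x, \hat J_y] = i\hbar \hat J_z$$

and corresponding relations obtained by rotating the $x$-$y$-$z$ subscripts. The
result $[\hat{\boldsymbol J}^2, \hat J_z] = 0$ implies complete sets of states exist with definite values of
$\hat{\boldsymbol J}^2$ and $\hat J_z$. Eigenvalues of $\hat{\boldsymbol J}^2$ are (with $\hbar = 1$) $j(j+1)$ where $j = 0, \tfrac12, 1, \ldots$;
eigenvalues of $\hat J_z$ are $m$ where $-j \leq m \leq j$, for given $j$. For orbital angular
momentum, $\hat{\boldsymbol J} \to \hat{\boldsymbol L} = \boldsymbol r \times \hat{\boldsymbol p}$ and eigenfunctions are spherical harmonics
$Y_{lm}(\theta, \phi)$, for which eigenvalues of $\hat{\boldsymbol L}^2$ and $\hat L_z$ are $l(l+1)$ and $m$ where $-l \leq
m \leq l$. For spin-$\tfrac12$ angular momentum, $\hat{\boldsymbol J} \to \tfrac12\boldsymbol\sigma$ where the Pauli matrices
$\boldsymbol\sigma = (\sigma_x, \sigma_y, \sigma_z)$ are

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.13}$$

Eigenvectors of $s_z$ are $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (eigenvalue $+\tfrac12$), and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (eigenvalue $-\tfrac12$).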
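
As a quick check of the spin-$\tfrac12$ statements above, the commutation relation and the value $j(j+1) = \tfrac34$ can be verified by direct matrix multiplication. This is a minimal sketch using plain nested lists; the helper names (`matmul`, `commutator`, `scale`) are illustrative choices, not notation from the text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Multiply every element of a 2x2 matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

# Spin operators s = sigma / 2 (with hbar = 1).
s_x, s_y, s_z = (scale(0.5, s) for s in (sigma_x, sigma_y, sigma_z))

# [s_x, s_y] should equal i s_z.
lhs = commutator(s_x, s_y)
rhs = scale(1j, s_z)
print("commutator matches i s_z:", lhs == rhs)

# s^2 = s_x^2 + s_y^2 + s_z^2 should be 3/4 times the unit matrix.
squares = [matmul(s, s) for s in (s_x, s_y, s_z)]
s_squared = [[sum(sq[i][j] for sq in squares) for j in range(2)] for i in range(2)]
print("s^2 =", s_squared)
```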


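Interaction with electromagnetic field: Particle of charge $q$ in electromagnetic
vector potential $\boldsymbol A$

$$\hat{\boldsymbol p} \to \hat{\boldsymbol p} - q\boldsymbol A. \tag{A.14}$$
Thus
$$\frac{1}{2m}(\hat{\boldsymbol p} - q\boldsymbol A)^2\psi = i\frac{\partial\psi}{\partial t} \tag{A.15}$$
and so
$$-\frac{1}{2m}\boldsymbol\nabla^2\psi + i\frac{q}{m}\boldsymbol A\cdot\boldsymbol\nabla\psi + \frac{q^2}{2m}\boldsymbol A^2\psi = i\frac{\partial\psi}{\partial t}. \tag{A.16}$$

Note: (i) chosen gauge $\boldsymbol\nabla\cdot\boldsymbol A = 0$; (ii) $q^2$ term is usually neglected.

Example: Magnetic field along $z$-axis, possible $\boldsymbol A$ consistent with $\boldsymbol\nabla\cdot\boldsymbol A = 0$
is $\boldsymbol A = \tfrac12 B(-y, x, 0)$ such that $\boldsymbol\nabla\times\boldsymbol A = (0, 0, B)$. Inserting this into the second
term on left-hand side of (A.16) gives

$$\frac{iqB}{2m}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\psi = -\frac{qB}{2m}\hat L_z\psi \tag{A.17}$$

which generalizes to the standard orbital magnetic moment interaction
$-\hat{\boldsymbol\mu}\cdot\boldsymbol B\,\psi$ where

$$\hat{\boldsymbol\mu} = \frac{q}{2m}\hat{\boldsymbol L}. \tag{A.18}$$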




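Time-dependent perturbation theory:

$$\hat H = \hat H_0 + \hat V \tag{A.19}$$
$$\hat H\psi = i\frac{\partial\psi}{\partial t}. \tag{A.20}$$

Unperturbed problem:

$$\hat H_0 u_n = E_n u_n. \tag{A.21}$$

Completeness:

$$\psi(\boldsymbol x, t) = \sum_n a_n(t)\, u_n(\boldsymbol x)\, e^{-iE_n t}. \tag{A.22}$$

First-order perturbation theory:

$$a_{fi} = -i\int d^3x\, dt\; u_f^*(\boldsymbol x)\, e^{+iE_f t}\, \hat V(\boldsymbol x, t)\, u_i(\boldsymbol x)\, e^{-iE_i t} \tag{A.23}$$

which has the form

$$a_{fi} = -i\int (\text{volume element})\,(\text{final state})^*\,(\text{perturbing potential})\,(\text{initial state}). \tag{A.24}$$

Important examples:

(i) $\hat V$ independent of $t$:

$$a_{fi} = -iV_{fi}\, 2\pi\,\delta(E_f - E_i) \tag{A.25}$$

where

$$V_{fi} = \int d^3x\; u_f^*(\boldsymbol x)\, \hat V(\boldsymbol x)\, u_i(\boldsymbol x). \tag{A.26}$$

(ii) Oscillating time-dependent potential:

(a) if $\hat V \sim e^{-i\omega t}$, time integral of $a_{fi}$ is
$$\int dt\; e^{+iE_f t}\, e^{-i\omega t}\, e^{-iE_i t} = 2\pi\,\delta(E_f - E_i - \omega) \tag{A.27}$$
i.e. the system has absorbed energy from potential;

(b) if $\hat V \sim e^{+i\omega t}$, time integral of $a_{fi}$ is
$$\int dt\; e^{+iE_f t}\, e^{+i\omega t}\, e^{-iE_i t} = 2\pi\,\delta(E_f + \omega - E_i) \tag{A.28}$$
i.e. the potential has absorbed energy from system.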
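Absorption and emission of photons: For electromagnetic radiation, far
from its sources, the vector potential satisfies the wave equation

$$\boldsymbol\nabla^2\boldsymbol A - \frac{\partial^2\boldsymbol A}{\partial t^2} = 0. \tag{A.29}$$
Solution:
$$\boldsymbol A(\boldsymbol x, t) = \boldsymbol A_0 \exp(-i\omega t + i\boldsymbol k\cdot\boldsymbol x) + \boldsymbol A_0^* \exp(+i\omega t - i\boldsymbol k\cdot\boldsymbol x). \tag{A.30}$$
With gauge condition $\boldsymbol\nabla\cdot\boldsymbol A = 0$ we have
$$\boldsymbol k\cdot\boldsymbol A_0 = 0 \tag{A.31}$$

and there are two independent polarization vectors for photons.
Treat the interaction in first-order perturbation theory:

$$\hat V(\boldsymbol x, t) = (iq/m)\,\boldsymbol A(\boldsymbol x, t)\cdot\boldsymbol\nabla. \tag{A.32}$$
Thus

$$\boldsymbol A_0 \exp(-i\omega t + i\boldsymbol k\cdot\boldsymbol x) \equiv \text{absorption of photon of energy } \omega$$
$$\boldsymbol A_0^* \exp(+i\omega t - i\boldsymbol k\cdot\boldsymbol x) \equiv \text{emission of photon of energy } \omega. \tag{A.33}$$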


B 


Natural Units 


In particle physics, a widely adopted convention is to work in a system of 
units, called natural units, in which 


$$\hbar = c = 1. \tag{B.1}$$

This avoids having to keep track of untidy factors of $\hbar$ and $c$ throughout a
calculation; only at the end is it necessary to convert back to more usual units.
Let us spell out the implications of this choice of $c$ and $\hbar$.

(i) $c = 1$. In conventional MKS units $c$ has the value

$$c = 3 \times 10^{8}\ {\rm m\,s^{-1}}. \tag{B.2}$$

By choosing units such that
$$c = 1 \tag{B.3}$$
since a velocity has the dimensions
$$[c] = [L][T]^{-1} \tag{B.4}$$
we are implying that our unit of length is numerically equal to our unit of
time. In this sense, length and time are equivalent dimensions:
$$[L] = [T]. \tag{B.5}$$
Similarly, from the energy-momentum relation of special relativity
$$E^2 = \boldsymbol p^2 c^2 + m^2 c^4 \tag{B.6}$$
we see that the choice of $c = 1$ also implies that energy, mass and momentum
all have equivalent dimensions. In fact, it is customary to refer to momenta
in units of ‘MeV/c’ or ‘GeV/c’; these all become ‘MeV’ or ‘GeV’ when $c = 1$.

(ii) $\hbar = 1$. The numerical value of Planck’s constant is
$$\hbar = 6.6 \times 10^{-22}\ {\rm MeV\,s} \tag{B.7}$$
and $\hbar$ has dimensions of energy multiplied by time so that
$$[\hbar] = [M][L]^2[T]^{-1}. \tag{B.8}$$
Setting $\hbar = 1$ therefore relates our units of $[M]$, $[L]$ and $[T]$. Since $[L]$ and
$[T]$ are equivalent by our choice of $c = 1$, we can choose $[M]$ as the single
independent dimension for our natural units:

$$[M] = [L]^{-1} = [T]^{-1}. \tag{B.9}$$


An example: the pion Compton wavelength. How do we convert from natural
units to more conventional units? Consider the pion Compton wavelength

$$\lambda_\pi = \hbar/M_\pi c \tag{B.10}$$
evaluated in both natural and conventional units. In natural units
$$\lambda_\pi = 1/M_\pi \tag{B.11}$$

where $M_\pi \simeq 140\ {\rm MeV}/c^2$. In conventional units, using $M_\pi$, $\hbar$ (B.7) and $c$
(B.2), we have the familiar result

$$\lambda_\pi = 1.41\ {\rm fm} \tag{B.12}$$
where the ‘fermi’ or femtometre, fm, is defined as
$$1\ {\rm fm} = 10^{-15}\ {\rm m}.$$

We therefore have the correspondence

$$\lambda_\pi = 1/M_\pi = 1.41\ {\rm fm}. \tag{B.13}$$
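
The conversion can be sketched in a few lines. The function name below is an illustrative choice, and the value of $\hbar c$ is the more precise one quoted later in this appendix; the only step is to multiply the natural-units result $1/M$ by $\hbar c$.

```python
HBAR_C_MEV_FM = 197.328  # hbar * c in MeV fm, as quoted in this appendix


def compton_wavelength_fm(mass_mev):
    """Convert the natural-units result 1/M into femtometres via hbar * c."""
    return HBAR_C_MEV_FM / mass_mev


M_PION_MEV = 140.0  # approximate pion mass used in (B.11)
print(f"pion Compton wavelength: {compton_wavelength_fm(M_PION_MEV):.2f} fm")
```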


Practical cross section calculations: An easy-to-remember relation may be
derived from the result

$$\hbar c \simeq 200\ {\rm MeV\,fm} \tag{B.14}$$

obtained directly from (B.2) and (B.7). Hence, in natural units, we have the
relation

$$1\ {\rm fm} \simeq \frac{1}{200\ {\rm MeV}} = 5\ ({\rm GeV})^{-1}. \tag{B.15}$$

Cross sections are calculated without $\hbar$’s and $c$’s and all masses, energies and
momenta typically in MeV or GeV. To convert the result to an area, we merely
remember the dimensions of a cross section:

$$[\sigma] = [L]^2 = [M]^{-2}. \tag{B.16}$$
If masses, momenta and energies have been specified in GeV, from (B.15) we
derive the useful result (from the more precise relation $\hbar c = 197.328$ MeV fm)

$$\left(\frac{1}{1\ {\rm GeV}}\right)^2 = 1\ ({\rm GeV})^{-2} \approx 0.3894\ {\rm mb} \tag{B.17}$$
where a millibarn, mb, is defined to be
$$1\ {\rm mb} = 10^{-31}\ {\rm m}^2.$$

Note that a ‘typical’ hadronic cross section corresponds to an area of about
$\lambda_\pi^2$ where
$$\lambda_\pi^2 = 1/M_\pi^2 = 20\ {\rm mb}.$$


Electromagnetic cross sections are an order of magnitude smaller: specifically
for lowest order $e^+e^- \to \mu^+\mu^-$

$$\sigma \approx \frac{86.8}{s}\ {\rm nb} \tag{B.18}$$

where $s$ is in (GeV)$^2$ (see problem 8.18(d) in chapter 8).
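
These conversions are easy to reproduce. The sketch below assumes the standard lowest-order QED expression $\sigma = 4\pi\alpha^2/3s$ for $e^+e^- \to \mu^+\mu^-$ and $\alpha \approx 1/137.036$; neither is stated in this appendix, and the function and constant names are illustrative.

```python
import math

HBAR_C_GEV_FM = 0.197328   # hbar * c in GeV fm (197.328 MeV fm)
FM2_TO_MB = 10.0           # 1 fm^2 = 1e-30 m^2 = 10 mb, since 1 mb = 1e-31 m^2
GEV2_INV_TO_MB = HBAR_C_GEV_FM**2 * FM2_TO_MB   # 1 GeV^-2 expressed in mb


def sigma_ee_to_mumu_nb(s_gev2, alpha=1.0 / 137.036):
    """Assumed lowest-order result 4*pi*alpha^2 / (3 s), converted to nanobarns."""
    sigma_natural = 4.0 * math.pi * alpha**2 / (3.0 * s_gev2)   # in GeV^-2
    return sigma_natural * GEV2_INV_TO_MB * 1.0e6                # mb -> nb


print(f"1 GeV^-2 = {GEV2_INV_TO_MB:.4f} mb")
print(f"sigma(s = 1 GeV^2) = {sigma_ee_to_mumu_nb(1.0):.1f} nb")
```

Because the cross section scales as $1/s$, the value at any other energy follows by dividing the $s = 1\ ({\rm GeV})^2$ result by $s$.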
C


Maxwell’s Equations: Choice of Units 


In high-energy physics, it is not the convention to use the rationalized MKS 
system of units when treating Maxwell’s equations. Since the discussion is 
always limited to field equations in vacuo, it is usually felt desirable to adopt 
a system of units in which these equations take their simplest possible form;
in particular, one such that the constants $\epsilon_0$ and $\mu_0$, employed in the MKS
system, do not appear. These two constants enter, of course, via the force
laws of Coulomb and Ampère, respectively. These laws relate a mechanical
quantity (force) to electrical ones (charge and current). The introduction of
$\epsilon_0$ in Coulomb’s law

$$\boldsymbol F = \frac{q_1 q_2\,\hat{\boldsymbol r}}{4\pi\epsilon_0 r^2} \tag{C.1}$$


enables one to choose arbitrarily one of the electrical units and assign to it 
a dimension independent of those entering into mechanics (mass, length and 
time). If, for example, we use the coulomb as the basic electrical quantity
(as in the MKS system), $\epsilon_0$ has dimension (coulomb)$^2\,[T]^2/[M][L]^3$. Thus
the common practical units (volt, ampère, coulomb, etc) can be employed
in applications to both fields and circuits. However, for our purposes this 
advantage is irrelevant, since we are only concerned with the field equations, 
not with practical circuits. In our case, we prefer to define the electrical units 
in terms of mechanical ones in such a way as to reduce the field equations to 
their simplest form. The field equation corresponding to (C.1) is 


$$\boldsymbol\nabla\cdot\boldsymbol E = \rho/\epsilon_0 \qquad \text{(Gauss' law: MKS)} \tag{C.2}$$


and this may obviously be simplified if we choose the unit of charge such that $\epsilon_0$
becomes unity. Such a system, in which CGS units are used for the mechanical
quantities, is a variant of the electrostatic part of the ‘Gaussian CGS’ system.
The original Gaussian system set $\epsilon_0 \to 1/4\pi$, thereby simplifying the force
law (C.1), but introducing a compensating $4\pi$ into the field equation (C.2).
The field equation is, in fact, primary, and the $4\pi$ is a geometrical factor
appropriate only to the specific case of three dimensions, so that it should
not appear in a field equation of general validity. The system in which $\epsilon_0$ in
(C.2) may be replaced by unity is called the ‘rationalized Gaussian CGS’ or
‘Heaviside–Lorentz’ system:


$$\boldsymbol\nabla\cdot\boldsymbol E = \rho \qquad \text{(Gauss' law; Heaviside–Lorentz)}. \tag{C.3}$$
Generally, systems in which the $4\pi$ factors appear in the force equations rather
than the field equations are called ‘rationalized’.

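Of course, (C.3) is only the first of the Maxwell equations in Heaviside–Lorentz
units. In the Gaussian system, $\mu_0$ in Ampère’s force law


$$\boldsymbol F = \frac{\mu_0}{4\pi}\int\!\!\int \frac{\boldsymbol j_1 \times (\boldsymbol j_2 \times \boldsymbol r_{12})}{r_{12}^3}\, d^3r_1\, d^3r_2 \tag{C.4}$$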


was set equal to $4\pi$, thereby defining a unit of current (the electromagnetic
unit or Biot (Bi emu)). The unit of charge (the electrostatic unit or Franklin
(Fr esu)) has already been defined by the (Gaussian) choice $\epsilon_0 = 1/4\pi$ and
currents via $\mu_0 \to 4\pi$, and $c$ appears explicitly in the equations. In the
rationalized (Heaviside–Lorentz) form of this system, $\epsilon_0 \to 1$ and $\mu_0 \to 1$, and
the remaining Maxwell equations are


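$$\boldsymbol\nabla\times\boldsymbol E = -\frac{1}{c}\frac{\partial\boldsymbol B}{\partial t} \tag{C.5}$$
$$\boldsymbol\nabla\cdot\boldsymbol B = 0 \tag{C.6}$$
$$\boldsymbol\nabla\times\boldsymbol B = \frac{1}{c}\boldsymbol j + \frac{1}{c}\frac{\partial\boldsymbol E}{\partial t}. \tag{C.7}$$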


A further discussion of units in electromagnetic theory is given in Panofsky 
and Phillips (1962, appendix I). 

Finally, throughout this book we have used a particular choice of units for 
mass, length and time such that $\hbar = c = 1$ (see appendix B). In that case, the
Maxwell equations we use are as in (C.3), (C.5)-(C.7), but with c replaced by 
unity. 

As an example of the relation between MKS and the system employed in 
this book (and universally in high-energy physics), we remark that the fine 
structure constant is written as 


$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \qquad \text{in MKS units} \tag{C.8}$$


or as


$$\alpha = \frac{e^2}{4\pi} \qquad \text{in Heaviside–Lorentz units with } \hbar = c = 1. \tag{C.9}$$


Clearly the value of $\alpha\ (\approx 1/137)$ is the same in both cases, but the numerical
values of ‘$e$’ in (C.8) and in (C.9) are, of course, different.
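
The difference between the two numerical values of ‘$e$’ can be made explicit with a short calculation. The SI constants below are standard published values that do not appear in the text, so they should be read as assumptions of this sketch.

```python
import math

# SI values (assumed here; not quoted in the text).
E_SI = 1.602176634e-19        # elementary charge in coulombs
HBAR_SI = 1.054571817e-34     # reduced Planck constant in J s
C_SI = 2.99792458e8           # speed of light in m/s
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m

# (C.8): fine structure constant evaluated in MKS (SI) units.
alpha = E_SI**2 / (4.0 * math.pi * EPSILON_0 * HBAR_SI * C_SI)

# (C.9): in Heaviside-Lorentz units with hbar = c = 1, alpha = e^2 / (4 pi),
# so the dimensionless charge is e = sqrt(4 pi alpha).
e_heaviside_lorentz = math.sqrt(4.0 * math.pi * alpha)

print(f"1/alpha = {1.0 / alpha:.3f}")
print(f"e (SI) = {E_SI:.4e} C,  e (Heaviside-Lorentz) = {e_heaviside_lorentz:.4f}")
```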

The choice of rationalized MKS units for Maxwell’s equations is a part of 
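the SI system of units. In this system of units the numerical values of $\mu_0$ and
$\epsilon_0$ are

$$\mu_0 = 4\pi \times 10^{-7}\ ({\rm kg\,m\,C^{-2}} = {\rm H\,m^{-1}})$$
and, since $\mu_0\epsilon_0 = 1/c^2$,
$$\epsilon_0 = \frac{10^7}{4\pi c^2} \approx \frac{1}{36\pi \times 10^9}\ ({\rm C^2\,s^2\,kg^{-1}\,m^{-3}} = {\rm F\,m^{-1}}).$$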


D 


Special Relativity: Invariance and Covariance 


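The co-ordinate 4-vector $x^\mu$ is defined by

$$x^\mu = (x^0, x^1, x^2, x^3)$$

where $x^0 = t$ (with $c = 1$) and $(x^1, x^2, x^3) = \boldsymbol x$. Under a Lorentz transformation
along the $x^1$-axis with velocity $v$, $x^\mu$ transforms to

$$\begin{aligned}
x^{0\prime} &= \gamma(x^0 - v x^1) \\
x^{1\prime} &= \gamma(-v x^0 + x^1) \\
x^{2\prime} &= x^2 \\
x^{3\prime} &= x^3
\end{aligned} \tag{D.1}$$

where $\gamma = (1 - v^2)^{-1/2}$.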

A general ‘contravariant 4-vector’ is defined to be any set of four quantities
$A^\mu = (A^0, A^1, A^2, A^3) \equiv (A^0, \boldsymbol A)$ which transform under Lorentz transformations
exactly as the corresponding components of the coordinate 4-vector $x^\mu$.
Note that the definition is phrased in terms of the transformation property
(under Lorentz transformations) of the object being defined. An important
example is the energy-momentum 4-vector $p^\mu = (E, \boldsymbol p)$, where for a particle
of rest mass $m$, $E = (\boldsymbol p^2 + m^2)^{1/2}$. Another example is the 4-gradient
$\partial^\mu = (\partial^0, -\boldsymbol\nabla)$ (see problem 2.1) where

$$\partial^0 = \frac{\partial}{\partial t}, \qquad \boldsymbol\nabla = \left(\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{D.2}$$

Lorentz transformations leave the expression $A^{0\,2} - \boldsymbol A^2$ invariant for a general
4-vector $A^\mu$. For example, $E^2 - \boldsymbol p^2 = m^2$ is invariant, implying that the rest
mass $m$ is invariant under Lorentz transformations. Another example is the
four-dimensional invariant differential operator analogous to $\boldsymbol\nabla^2$, namely

$$\Box = \partial^{0\,2} - \boldsymbol\nabla^2$$
which is precisely the operator appearing in the massless wave equation
$$\Box\phi = \partial^{0\,2}\phi - \boldsymbol\nabla^2\phi = 0.$$

The expression $A^{0\,2} - \boldsymbol A^2$ may be regarded as the scalar product of $A^\mu$ with
a related ‘covariant vector’ $A_\mu = (A^0, -\boldsymbol A)$. Then

$$A^{0\,2} - \boldsymbol A^2 = \sum_\mu A^\mu A_\mu$$
where, in practice, the summation sign on repeated ‘upstairs’ and ‘downstairs’
indices is always omitted. We shall often shorten the expression ‘$A^\mu A_\mu$’ even
further, to ‘$A^2$’; thus $p^2 = E^2 - \boldsymbol p^2 = m^2$. The ‘downstairs’ version of $\partial^\mu$ is
$\partial_\mu = (\partial^0, \boldsymbol\nabla)$. Then $\partial_\mu\partial^\mu = \partial^2 = \Box$. ‘Lowering’ and ‘raising’ indices is effected
by the metric tensor $g^{\mu\nu}$ or $g_{\mu\nu}$, where $g^{00} = g_{00} = 1$, $g^{11} = g^{22} = g^{33} =
g_{11} = g_{22} = g_{33} = -1$, all other components vanishing. Thus if $A_\mu = g_{\mu\nu}A^\nu$
then $A_0 = A^0$, $A_1 = -A^1$, etc.
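In the same way, the scalar product $A\cdot B$ of two 4-vectors is

$$A\cdot B = A^\mu B_\mu = A^0B^0 - \boldsymbol A\cdot\boldsymbol B \tag{D.3}$$

and this is also invariant under Lorentz transformations. For example, the
invariant four-dimensional divergence of a 4-vector $j^\mu = (\rho, \boldsymbol j)$ is

$$\partial^\mu j_\mu = \partial^0\rho - (-\boldsymbol\nabla)\cdot\boldsymbol j = \partial^0\rho + \boldsymbol\nabla\cdot\boldsymbol j = \partial_\mu j^\mu \tag{D.4}$$

since the spatial part of $\partial^\mu$ is $-\boldsymbol\nabla$.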
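
The invariance of the scalar product (D.3) under the boost (D.1) can be checked numerically. This is a sketch only: the two 4-vectors and the velocity are arbitrary illustrative values, and the function names are not notation from the text.

```python
import math


def boost_x(vec, v):
    """Apply the boost (D.1) along the x^1-axis with velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    a0, a1, a2, a3 = vec
    return (gamma * (a0 - v * a1), gamma * (-v * a0 + a1), a2, a3)


def dot4(a, b):
    """Scalar product (D.3): A.B = A^0 B^0 - (3-vector dot product)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


# Illustrative 4-momentum of a particle with m = 1, and an arbitrary second 4-vector.
p = (math.sqrt(1.0 + 0.3**2 + 0.4**2 + 1.2**2), 0.3, 0.4, 1.2)
a = (2.0, -0.5, 1.0, 0.25)

v = 0.6
p_prime, a_prime = boost_x(p, v), boost_x(a, v)

print(dot4(p, p), dot4(p_prime, p_prime))   # both should be close to m^2 = 1
print(dot4(p, a), dot4(p_prime, a_prime))   # should agree up to rounding
```

The individual components of `p_prime` and `a_prime` differ from those of `p` and `a`, but the combinations $p^2$ and $p\cdot a$ do not change.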

Because the Lorentz transformation is linear, it immediately follows that
the sum (or difference) of two 4-vectors is also a 4-vector. In a reaction of the
type ‘$1 + 2 \to 3 + 4 + \cdots N$’ we express the conservation of both energy and
momentum as one ‘4-momentum conservation equation’:

$$p_1^\mu + p_2^\mu = p_3^\mu + p_4^\mu + \cdots + p_N^\mu. \tag{D.5}$$

In practice, the 4-vector index on all the $p$’s is conventionally omitted in
conservation equations such as (D.5), but it is nevertheless important to
remember, in that case, that it is actually four equations, one for the energy
components and a further three for the momentum components. Further, it
follows that quantities such as $(p_1 + p_2)^2$, $(p_1 - p_3)^2$ are invariant under Lorentz
transformations.

We may also consider products of the form $A^\mu B^\nu$, where $A$ and $B$ are
4-vectors. As $\mu$ and $\nu$ each run over their four possible values (0, 1, 2, 3)
16 different ‘components’ are generated ($A^0B^0, A^0B^1, \ldots, A^3B^3$). Under a
Lorentz transformation, the components of $A$ and $B$ will transform into definite
linear combinations of themselves, as in the particular case of (D.1). It
follows that the 16 components of $A^\mu B^\nu$ will also transform into well-defined
linear combinations of themselves (try it for $A^0B^1$ and (D.1)). Thus we have
constructed a new object whose 16 components transform by a well-defined
linear transformation law under a Lorentz transformation, as did the components
of a 4-vector. This new quantity, defined by its transformation law, is
called a tensor, or more precisely a ‘contravariant second-rank tensor’, the
‘contravariant’ referring to the fact that both indices are upstairs, the ‘second
rank’ meaning that it has two indices. An important example of such a tensor
is provided by $\partial^\mu A^\nu(x) - \partial^\nu A^\mu(x)$, which is the electromagnetic field strength
tensor $F^{\mu\nu}$, introduced in chapter 2. More generally we can consider tensors
$B^{\mu\nu}$ which are not literally formed by ‘multiplying’ two vectors together,
but which transform in just the same way; and we can introduce third- and
higher-rank tensors similarly, which can also be ‘mixed’, with some upstairs
and some downstairs indices.

We now state a very useful and important fact. Suppose we ‘dot’ a downstairs
4-vector $A_\mu$ into a contravariant second-rank tensor $B^{\mu\nu}$, via the operation
$A_\mu B^{\mu\nu}$, where as always a sum on the repeated index $\mu$ is understood.
Then this quantity transforms as a 4-vector, via its ‘loose’ index $\nu$. This is
obvious if $B^{\mu\nu}$ is actually a product such as $B^{\mu\nu} = C^\mu D^\nu$, since then we have
$A_\mu B^{\mu\nu} = (A\cdot C)D^\nu$, and $(A\cdot C)$ is an invariant, which leaves the 4-vector $D^\nu$
as the only ‘transforming’ object left. But even if $B^{\mu\nu}$ is not such a product,
it transforms under Lorentz transformations in exactly the same way as if it
were, and this leads to the same result. An example is provided by the quantity
$\partial_\mu F^{\mu\nu}$ which enters on the left-hand side of the Maxwell equations in the
form (2.18).

This example brings us conveniently to the remaining concept we need to 
introduce here, which is the important one of ‘covariance’. Referring to (2.18), 
we note that it has the form of an equality between two quantities ($\partial_\mu F^{\mu\nu}$ on
the left, $j^\nu_{\rm em}$ on the right) each of which transforms in the same way under
Lorentz transformations — namely as a contravariant 4-vector. One says that 
(2.18) is ‘Lorentz covariant’, the word ‘covariant’ here meaning precisely that 
both sides transform in the same way (i.e. consistently) under Lorentz trans- 
formations. Confusingly enough, this use of the word ‘covariant’ is evidently 
quite different from the one encountered previously in an expression such as 
‘a covariant 4-vector’, where it just meant a 4-vector with a downstairs index. 
This new meaning of ‘covariant’ is actually much better captured by an alter- 
native name for the same thing, which is ‘form invariant’, as we will shortly 
see. 

Why is this idea so important? Consider the (special) relativity principle, 
which states that the laws of physics should be the same in all inertial frames. 
The way in which this physical requirement is implemented mathematically 
is precisely via the notion of covariance under Lorentz transformations. For, 
consider how a law will typically be expressed. Relative to one inertial frame, 
we set up a coordinate system and describe the phenomena in question in 
terms of suitable coordinates, and such other quantities (forces, fields, etc) as 
may be necessary. We write the relevant law mathematically as equations re- 
lating these quantities, all referred to our chosen frame and coordinate system. 
What the relativity principle requires is that these relationships — these equa- 
tions — must have the same form when the quantities in them are referred to 
a different inertial frame. Note that we must say ‘have the same form’, rather 
than ‘be identical to’, since we know very well that coordinates, at least, are 
not identical in two different inertial frames (cf (D.1)). This is why the term 
‘form invariant’ is a more helpful one than ‘covariant’ in this context, but the 
latter is more commonly used. 

A more elementary example may be helpful. Consider Newton’s law in the
simple form $\boldsymbol F = m\ddot{\boldsymbol r}$. This equation is ‘covariant under rotations’, meaning
that it preserves the same form under a rotation of the coordinate system,
and this in turn means that the physics it expresses is independent of the
orientation of our coordinate axes. The ‘same form’ in this case is of course
just $\boldsymbol F' = m\ddot{\boldsymbol r}'$. We emphasize again that the components of $\boldsymbol F'$ are not the
same as those of $\boldsymbol F$, nor are the components of $\ddot{\boldsymbol r}'$ the same as those of $\ddot{\boldsymbol r}$;
but the relationship between $\boldsymbol F'$ and $\ddot{\boldsymbol r}'$ is exactly the same as the relationship
between $\boldsymbol F$ and $\ddot{\boldsymbol r}$, and that is what is required.

It is important to understand why this deceptively simple result ($\boldsymbol F' =
m\ddot{\boldsymbol r}'$) has been obtained. The reason is that we have assumed (or asserted)
that ‘force’ is in fact to be represented mathematically as a 3-vector quantity.
Once we have said that, the rest follows. More formally, the transformation
law of the components of $\boldsymbol r$ is $r_i' = R_{ij}r_j$ (sum on $j$ understood), where the
matrix of transformation coefficients $R$ is ‘orthogonal’ ($RR^{\rm T} = R^{\rm T}R = I$),
which ensures that the length (squared) of $\boldsymbol r$ is invariant, $\boldsymbol r^2 = \boldsymbol r'^2$. To say
that ‘force is a 3-vector’ then implies that the components of $\boldsymbol F$ transform
by the same set of coefficients $R_{ij}$: $F_i' = R_{ij}F_j$. Thus starting from the
law $F_j = m\ddot r_j$ which relates the components in one frame, by multiplying
both sides of the equation by $R_{ij}$ and summing over $j$ we arrive at $F_i' = m\ddot r_i'$,
which states precisely that the components in the primed frame bear the same
relationship to each other as the components in the unprimed frame did. This
is the property of covariance under rotations, and it ensures that the physics
embodied in the law is the same for all systems which differ from one another
only by a rotation.

In just the same way, if we can write equations of physics as equalities
between quantities which transform in the same way (i.e. ‘are covariant’) under
Lorentz transformations, we will guarantee that these laws obey the relativity
principle. This is indeed the case in the Lorentz covariant formulation of
Maxwell’s equations, given in (2.18), which we now repeat here: $\partial_\mu F^{\mu\nu} = j^\nu_{\rm em}$.
To check covariance, we follow essentially the same steps as in the case of
Newton’s equations, except that the transformations being considered are
Lorentz transformations. Inserting the expression (2.19) for $F^{\mu\nu}$, the equation
can be written as $(\partial_\mu\partial^\mu)A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}$. The two quantities enclosed
in parentheses are actually invariants, as was mentioned earlier. This means
that $\partial_\mu\partial^\mu$ is equal to $\partial'_\mu\partial'^\mu$, and similarly $\partial_\mu A^\mu = \partial'_\mu A'^\mu$, so that we can
write the equation as $(\partial'_\mu\partial'^\mu)A^\nu - \partial^\nu(\partial'_\mu A'^\mu) = j^\nu_{\rm em}$. It is now clear that if
we apply a Lorentz transformation to both sides, $A^\nu$ and $\partial^\nu$ will become $A'^\nu$
and $\partial'^\nu$ respectively, while $j^\nu_{\rm em}$ will become $j'^\nu_{\rm em}$, since all these quantities
are 4-vectors, transforming the same way (as the 3-vectors did in the Newton
case). Thus we obtain just the same form of equation, written in terms of the
‘primed frame’ quantities, and this is the essence of (Lorentz transformation)
covariance.

Actually, the detailed ‘check’ that we have just performed is really unnec- 
essary. All that is required for covariance is that (once again!) both sides of 
equations transform the same way. That this is true of (2.18) can be seen ‘by 
inspection’, once we understand the significance (for instance) of the fact that 
the $\mu$ indices are ‘dotted’ so as to form an invariant. This example should
convince the reader of the power of the 4-vector notation for this purpose:
compare the ‘by inspection’ covariance of (2.18) with the job of verifying
Lorentz covariance starting from the original Maxwell equations (2.1), (2.2),
(2.3) and (2.8)! The latter involves establishing the rather complicated
transformation law for the fields $\boldsymbol E$ and $\boldsymbol B$ (which, of course, form parts of the
tensor $F^{\mu\nu}$). One can indeed show in this way that the Maxwell equations
are covariant under Lorentz transformations, but they are not manifestly (i.e. 
without doing any work) so, whereas in the form (2.18) they are. 
E


Dirac δ-Function


Consider approximating an integral by a sum over strips $\Delta x$ wide as shown
in figure E.1:

$$\int_{x_1}^{x_2} f(x)\,dx \approx \sum_i f(x_i)\,\Delta x. \tag{E.1}$$

Consider the function $\delta(x - x_j)$ shown in figure E.2,

$$\delta(x - x_j) = \begin{cases} 1/\Delta x & \text{in the } j\text{th interval} \\ 0 & \text{all others.} \end{cases} \tag{E.2}$$

Clearly this function has the properties

$$\sum_i f(x_i)\,\delta(x_i - x_j)\,\Delta x = f(x_j) \tag{E.3}$$

and

$$\sum_i \delta(x_i - x_j)\,\Delta x = 1. \tag{E.4}$$

In the limit as we pass to an integral form, we might expect (applying (E.1)
to the left-hand sides) that these equations reduce to

$$\int_{x_1}^{x_2} f(x)\,\delta(x - x_j)\,dx = f(x_j) \tag{E.5}$$

and

$$\int_{x_1}^{x_2} \delta(x - x_j)\,dx = 1 \tag{E.6}$$
provided that $x_1 < x_j < x_2$. Clearly such ‘δ-functions’ can easily be generalized
to more dimensions, e.g. three dimensions:

$$dV = dx\,dy\,dz \equiv d^3r, \qquad \delta(\boldsymbol r - \boldsymbol r_j) \equiv \delta(x - x_j)\,\delta(y - y_j)\,\delta(z - z_j). \tag{E.7}$$


FIGURE E.1
Approximate evaluation of integral.

FIGURE E.2
The function $\delta(x - x_j)$.

FIGURE E.3
The function (E.10) for finite $N$.

Informally, therefore, we can think of the δ-function as a function that is zero
everywhere except where its argument vanishes, at which point it is infinite
in such a way that its integral has unit area, and equations (E.5) and (E.6)
hold. Do such amazing functions exist? In fact, the informal idea just given
does not define a respectable mathematical function. More properly the use
of the ‘δ-function’ can be justified by introducing the notion of ‘distributions’
or ‘generalized functions’. Roughly speaking, this means we can think of the
‘δ-function’ as the limit of a sequence of functions, whose properties converge
to those given here. The following useful expressions all approximate the
δ-function in this sense:

$$\delta(x) = \lim_{\epsilon\to 0} \begin{cases} 1/\epsilon & \text{for } -\epsilon/2 \leq x \leq \epsilon/2 \\ 0 & \text{for } |x| > \epsilon/2 \end{cases} \tag{E.8}$$
$$\delta(x) = \lim_{\epsilon\to 0} \frac{1}{\pi}\frac{\epsilon}{x^2 + \epsilon^2} \tag{E.9}$$
$$\delta(x) = \lim_{N\to\infty} \frac{1}{\pi}\frac{\sin(Nx)}{x}. \tag{E.10}$$


The first of these is essentially the same as (E.2), and the second is a ‘smoother’ 
version of the first. The third is sketched in figure E.3: as N tends to infin- 
ity, the peak becomes infinitely high and narrow, but it still preserves unit 
area. 


Usually, under integral signs, δ-functions can be manipulated with no danger
of obtaining a mathematically incorrect result. However, care must be
taken when products of two such generalized functions are encountered.
Resumé of Fourier series and Fourier transforms


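Fourier’s theorem asserts that any suitably well-behaved periodic function
with period $L$ can be expanded as follows:

$$f(x) = \sum_{n=-\infty}^{\infty} a_n\, e^{2\pi i n x/L}. \tag{E.11}$$
Using the orthonormality relation
$$\frac{1}{L}\int_{-L/2}^{L/2} e^{-2\pi i m x/L}\, e^{2\pi i n x/L}\, dx = \delta_{mn} \tag{E.12}$$

with the Kronecker δ-symbol defined by

$$\delta_{mn} = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases} \tag{E.13}$$

the coefficients in the expansion may be determined:

$$a_m = \frac{1}{L}\int_{-L/2}^{L/2} f(x)\, e^{-2\pi i m x/L}\, dx. \tag{E.14}$$

Consider the limit of these expressions as $L \to \infty$. We may write

$$f(x) = \sum_{n=-\infty}^{\infty} F_n\,\Delta n \tag{E.15}$$
with
$$F_n = a_n\, e^{2\pi i n x/L} \tag{E.16}$$

and the interval $\Delta n = 1$. Defining
$$2\pi n/L = k \tag{E.17}$$

and

$$L a_n = g(k) \tag{E.18}$$

we can take the limit $L \to \infty$ to obtain

$$f(x) = \int_{-\infty}^{\infty} F_n\, dn = \int_{-\infty}^{\infty} \frac{g(k)\, e^{ikx}}{L}\,\frac{L\,dk}{2\pi}. \tag{E.19}$$
Thus
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(k)\, e^{ikx}\, dk \tag{E.20}$$
and similarly from (E.14)

$$g(k) = \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx. \tag{E.21}$$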


These are the Fourier transform relations, and they lead us to an important
representation of the Dirac δ-function.
Substitute $g(k)$ from (E.21) into (E.20) to obtain

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\; e^{ikx}\int_{-\infty}^{\infty} dx'\; e^{-ikx'} f(x'). \tag{E.22}$$

Reordering the integrals, we arrive at the result

$$f(x) = \int_{-\infty}^{\infty} dx'\, f(x')\left\{\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x - x')}\, dk\right\} \tag{E.23}$$

valid for any function $f(x)$. Thus the expression

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\; e^{ik(x - x')} \tag{E.24}$$
has the remarkable property of vanishing everywhere except at $x = x'$, and
its integral with respect to $x'$ over any interval including $x$ is unity (set $f = 1$
in (E.23)). In other words, (E.24) provides us with a new representation of
the Dirac δ-function:

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\, dk. \tag{E.25}$$


Equation (E.25) is very important. It is the representation of the δ-function
which is most commonly used, and it occurs throughout this book.
Note that if we replace the upper and lower limits of integration in (E.25) by
$N$ and $-N$, and consider the limit $N \to \infty$, we obtain exactly (E.10).

The integral in (E.25) represents the superposition, with identical uniform
weight $(2\pi)^{-1}$, of plane waves of all wavenumbers. Physically it may
be thought of (cf (E.20)) as the Fourier transform of unity. Equation (E.25)
asserts that the contributions from all these waves cancel completely, unless
the phase parameter $x$ is zero, in which case the integral manifestly diverges
and ‘$\delta(0)$ is infinity’ as expected. The fact that the Fourier transform of a
constant is a δ-function is an extreme case of the bandwidth theorem from
Fourier transform theory, which states that if the (suitably defined) ‘spread’ in
a function $g(k)$ is $\Delta k$, and that of its transform $f(x)$ is $\Delta x$, then $\Delta x\,\Delta k \geq \tfrac12$.
In the present case $\Delta k$ is tending to infinity and $\Delta x$ to zero.

One very common use of (E.25) refers to the normalization of plane-wave
states. If we rewrite it in the form

$$\delta(k - k') = \int_{-\infty}^{\infty} \frac{e^{-ik'x}}{(2\pi)^{1/2}}\,\frac{e^{ikx}}{(2\pi)^{1/2}}\, dx \tag{E.26}$$
we can interpret it to mean that the wavefunctions $e^{ikx}/(2\pi)^{1/2}$ and $e^{ik'x}/(2\pi)^{1/2}$
are orthogonal on the real axis $-\infty \leq x \leq \infty$ for $k \neq k'$ (since the left-hand
side is zero), while for $k = k'$ their overlap is infinite, in such a way that the
integral of this overlap is unity. This is the continuum analogue of orthonormality
for wavefunctions labelled by a discrete index, as in (E.12). We say that
the plane waves in (E.26) are ‘normalized to a δ-function’. There is, however,
a problem with this: plane waves are not square integrable and thus do not
strictly belong to a Hilbert space. Mathematical physicists concerned with
such matters have managed to deal with this by introducing ‘rigged’ Hilbert
spaces in which such a normalization is legitimate. Although we often, in the
text, appear to be using ‘box normalization’ (i.e. restricting space to a finite
volume $V$), in practice when we evaluate integrals over plane waves the limits
will be extended to infinity, and results like (E.26) will be used repeatedly.
Important three- and four-dimensional generalizations of (E.25) are:

$$\int e^{i\boldsymbol k\cdot\boldsymbol x}\, d^3k = (2\pi)^3\,\delta(\boldsymbol x) \tag{E.27}$$
and

$$\int e^{ik\cdot x}\, d^4k = (2\pi)^4\,\delta(x) \tag{E.28}$$
where $k\cdot x = k^0x^0 - \boldsymbol k\cdot\boldsymbol x$ (see appendix D) and $\delta(x) = \delta(x^0)\,\delta(\boldsymbol x)$.
Properties of the -function 


The basic properties of the $\delta$-function are exemplified by the equations (see
(E.5) and (E.6))

$$\int_{-\infty}^{\infty}\delta(x-a)\,dx = 1, \qquad \delta(x-a) = 0 \ \ \text{for}\ x \neq a, \tag{E.29}$$

where $a$ is any real number; and

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a), \tag{E.30}$$

where $f(x)$ is any continuous function of $x$. Other useful properties follow:
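(i)
$$\delta(ax) = \frac{1}{|a|}\,\delta(x). \tag{E.31}$$

Proof

For $a > 0$,

$$\int_{-\infty}^{\infty}\delta(ax)\,dx = \int_{-\infty}^{\infty}\delta(y)\,\frac{dy}{a} = \frac{1}{a}; \tag{E.32}$$

for $a < 0$,

$$\int_{-\infty}^{\infty}\delta(ax)\,dx = \int_{\infty}^{-\infty}\delta(y)\,\frac{dy}{a} = \int_{-\infty}^{\infty}\delta(y)\,\frac{dy}{|a|} = \frac{1}{|a|}. \tag{E.33}$$

(ii)
$$\delta(x) = \delta(-x), \quad \text{i.e. an even function.} \tag{E.34}$$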


Proof 


$$f(0) = \int_{-\infty}^{\infty}\delta(x)\,f(x)\,dx. \tag{E.35}$$

If $f(x)$ is an odd function, $f(0) = 0$. Thus $\delta(x)$ must be an even function.

(iii)
$$\delta(f(x)) = \sum_i \frac{1}{\left|df/dx\right|_{x=a_i}}\,\delta(x-a_i) \tag{E.36}$$

where $a_i$ are the roots of $f(x) = 0$.


Proof 


The $\delta$-function is only non-zero when its argument vanishes. Thus we are
concerned with the roots of $f(x) = 0$. In the vicinity of a root

$$f(a_i) = 0 \tag{E.37}$$

we can make a Taylor expansion

$$f(x) = f(a_i) + (x-a_i)\left(\frac{df}{dx}\right)_{x=a_i} + \cdots. \tag{E.38}$$

Thus the $\delta$-function has non-zero contributions from each of the roots $a_i$ of
the form

$$\delta(f(x)) = \sum_i \delta\!\left[(x-a_i)\left(\frac{df}{dx}\right)_{x=a_i}\right]. \tag{E.39}$$

Hence (using property (i)) we have

$$\delta(f(x)) = \sum_i \frac{1}{\left|df/dx\right|_{x=a_i}}\,\delta(x-a_i). \tag{E.40}$$

Consider the example

$$\delta(x^2-a^2). \tag{E.41}$$

Thus

$$f(x) = x^2-a^2 = (x-a)(x+a) \tag{E.42}$$

with two roots $x = \pm a$ ($a > 0$), and $df/dx = 2x$. Hence

$$\delta(x^2-a^2) = \frac{1}{2a}\left[\delta(x-a) + \delta(x+a)\right]. \tag{E.43}$$
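
As a rough numerical illustration of (E.43), the sketch below replaces the $\delta$-function by a narrow unit-area Gaussian and compares the integral of $g(x)\,\delta(x^2-a^2)$ with $[g(a)+g(-a)]/2a$. The Gaussian regularization, the width `eps`, the test function `g` and all helper names are assumptions made only for this check; they are not part of the text.

```python
import math

def narrow_gaussian(u, eps):
    """Unit-area Gaussian of width eps; it approximates delta(u) as eps -> 0."""
    return math.exp(-u * u / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def integrate(func, lo, hi, n=200_000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def lhs(g, a, eps=1e-3):
    """Integral of g(x) * delta_eps(x^2 - a^2) over a window containing both roots."""
    return integrate(lambda x: g(x) * narrow_gaussian(x * x - a * a, eps),
                     -3.0 * a, 3.0 * a)

def rhs(g, a):
    """Right-hand side of (E.43) integrated against g: [g(a) + g(-a)] / (2a)."""
    return (g(a) + g(-a)) / (2.0 * a)

g = lambda x: math.exp(x) + x * x   # arbitrary smooth test function (illustrative)
a = 1.5
print(lhs(g, a), rhs(g, a))         # the two numbers should agree closely
```

The agreement improves as `eps` is reduced, provided the integration grid stays fine enough to resolve the two narrow peaks at $x = \pm a$.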

(iv)
$$x\,\delta(x) = 0. \tag{E.44}$$

This is to be understood as always occurring under an integral. It is obvious
from the definition or from property (ii).

(v)
$$\int_{-\infty}^{\infty} f(x)\,\delta'(x)\,dx = -f'(0) \tag{E.45}$$

where

$$\delta'(x) = \frac{d}{dx}\,\delta(x). \tag{E.46}$$

Proof

$$\int_{-\infty}^{\infty} f(x)\,\delta'(x)\,dx = -\int_{-\infty}^{\infty} f'(x)\,\delta(x)\,dx + \Big[f(x)\,\delta(x)\Big]_{-\infty}^{\infty} = -f'(0) \tag{E.47}$$

since the second term vanishes.

(vi)
$$\int_{-\infty}^{x}\delta(x'-a)\,dx' = \theta(x-a) \tag{E.48}$$

where

$$\theta(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x > 0 \end{cases} \tag{E.49}$$

is the so-called ‘$\theta$-function’.

Proof

For $x > a$,

$$\int_{-\infty}^{x}\delta(x'-a)\,dx' = 1; \tag{E.50}$$

for $x < a$,

$$\int_{-\infty}^{x}\delta(x'-a)\,dx' = 0. \tag{E.51}$$

By a simple extension it is easy to prove the result

$$\int_{x_1}^{x_2}\delta(x-a)\,dx = \theta(x_2-a) - \theta(x_1-a). \tag{E.52}$$

(vii)
$$\delta(x-y)\,\delta(x-z) = \delta(x-y)\,\delta(y-z). \tag{E.53}$$

Proof

Take any continuous function of $z$, $f(z)$. Then

$$\int f(z)\,dz\,\{\delta(x-y)\,\delta(x-z)\} = f(x)\,\delta(x-y) \tag{E.54}$$

$$= f(y)\,\delta(x-y) = \int f(z)\,dz\,\{\delta(x-y)\,\delta(y-z)\}. \tag{E.55}$$

Thus the two sides of (vii) are equivalent as factors in an integrand with $z$ as
the integration variable.

Exercise


Use property (iii) plus the definition of the $\theta$-function to perform the $p^0$
integration and prove the useful phase space formula

$$\int d^4p\;\delta(p^2-m^2)\,\theta(p^0) = \int \frac{d^3\mathbf{p}}{2E} \tag{E.56}$$

where

$$p^2 = (p^0)^2 - \mathbf{p}^2 \tag{E.57}$$

and

$$E = +(\mathbf{p}^2+m^2)^{1/2}. \tag{E.58}$$

The relation (E.56) shows that the expression $d^3\mathbf{p}/2E$ is Lorentz invariant:
on the left-hand side, $d^4p$ and $\delta(p^2-m^2)$ are invariant, while $\theta(p^0)$
depends only on the sign of $p^0$, which cannot be changed by a ‘proper’ Lorentz
transformation, that is, one that does not reverse the sense of time.

Contour Integration


We begin by recalling some relevant results from the calculus of real functions 
of two real variables x and y, which we shall phrase in ‘physical’ terms. Con- 
sider a particle moving in the xy-plane subject to a force F = (P(x,y), Q(x, y)) 
whose x- and y-components P and Q vary throughout the plane. Suppose the 
particle moves, under the action of the force, around a closed path C in the 
xy-plane. Then the total work done by the force on the particle, $W_C$, will be
given by the integral

$$W_C = \oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C P\,dx + Q\,dy \tag{F.1}$$

where the $\oint$ sign means that the integration path is closed. Using Stokes’
theorem, we can rewrite (F.1) as a surface integral

$$W_C = \iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S} \tag{F.2}$$

where $S$ is any surface bounded by $C$ (as a butterfly net is bounded by the rim).
Taking $S$ to be the area in the xy-plane enclosed by $C$, we have
$d\mathbf{S} = dx\,dy\,\hat{\mathbf{k}}$ and

$$W_C = \iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy. \tag{F.3}$$

A mathematically special, but physically common, case is that in which $\mathbf{F}$
is a ‘conservative force’, derivable from a potential function $V(x, y)$ (in this
two-dimensional example) such that

$$P(x,y) = -\frac{\partial V}{\partial x} \quad\text{and}\quad Q(x,y) = -\frac{\partial V}{\partial y} \tag{F.4}$$

the minus signs being the usual convention. In that case, it is clear that

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{F.5}$$

and hence $W_C$ in (F.3) is zero. The condition (F.5) is, in fact, both necessary
and sufficient for $W_C = 0$.
There can, however, be surprises. Consider, for example, the potential 


$$V(x,y) = -\tan^{-1}(y/x). \tag{F.6}$$

In this case the components of the associated force are

$$P = -\frac{\partial V}{\partial x} = \frac{-y}{x^2+y^2}, \qquad Q = -\frac{\partial V}{\partial y} = \frac{x}{x^2+y^2}. \tag{F.7}$$

Let us calculate the work done by this force in the case that $C$ is the circle
of unit radius centred on the origin, traversed in the anticlockwise sense. We
may parametrize a point on this circle by $(x = \cos\theta,\ y = \sin\theta)$, so that (F.1)
becomes

$$W_C = \oint_C -\sin\theta\,(-\sin\theta\,d\theta) + \cos\theta\,(\cos\theta\,d\theta) = \oint_C d\theta = 2\pi \tag{F.8}$$

a result which is plainly different from zero. The reason is that although this
force is (minus) the gradient of a potential, the latter is not single-valued, in
the sense that it does not return to its original value after a circuit round the
origin. Indeed, the $V$ of (F.6) is just $-\theta$, which changes by $-2\pi$ on such a
circuit, exactly as calculated in (F.8) allowing for the minus signs in (F.4).
Alternatively, we may suspect that the trouble has to do with the ‘blow up’
of the integrand of (F.7) at the point $x = y = 0$, which is also true.
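
The result (F.8) can be checked with a few lines of code. The sketch below sums $P\,dx + Q\,dy$ for the force (F.7) around a circle, once enclosing the origin and once not; the function names, the midpoint discretization and the choice of circles are illustrative assumptions.

```python
import math

def force(x, y):
    """Components (P, Q) of the force in (F.7)."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def work_around_circle(cx, cy, radius, n=100_000):
    """Approximate the closed line integral of P dx + Q dy, anticlockwise,
    around the circle of the given radius centred on (cx, cy)."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        p, q = force(x, y)
        # dx = -radius sin(theta) dtheta, dy = radius cos(theta) dtheta
        total += (p * (-radius * math.sin(theta)) + q * (radius * math.cos(theta))) * dtheta
    return total

print(work_around_circle(0.0, 0.0, 1.0))   # circle encloses the origin
print(work_around_circle(3.0, 0.0, 1.0))   # circle does not enclose the origin
```

The first value reproduces $2\pi$ whatever the radius, while the second is zero to rounding error, in line with the remark that the trouble is tied to the point $x = y = 0$.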

Much of the foregoing has direct parallels within the theory of functions
of a complex variable $z = x + iy$, to which we now give a brief and informal
introduction, limiting ourselves to the minimum required in the text¹. The
crucial property, to which all the results we need are related, is analyticity. A
function $f(z)$ is analytic in a region $R$ of the complex plane if it has a unique
derivative at every point of $R$. The derivative at a point $z$ is defined by the
natural generalization of the real variable definition:

$$\frac{df}{dz} = \lim_{\Delta z\to 0}\frac{f(z+\Delta z) - f(z)}{\Delta z}. \tag{F.9}$$

The crucial new feature in the complex case, however, is that ‘$\Delta z$’ is actually
an (infinitesimal) vector, in the xy (Argand) plane. Thus we may immediately
ask: along which of the infinitely many possible directions of $\Delta z$ are we
supposed to approach the point $z$ in (F.9)? The answer is: along any! This is
the force of the word ‘unique’ in the definition of analyticity, and it is a very
powerful requirement.

Let f(z) be an analytic function of z in some region R, and let u and v 
be the real and imaginary parts of f: f = u+iv, where u and v are each 
functions of x and y. Let us evaluate df/dz at the point z = x + iy in two 
different ways, which must be equivalent. 

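(i) By considering $\Delta z = \Delta x$ (i.e. $\Delta y = 0$). In this case

$$\frac{df}{dz} = \lim_{\Delta x\to 0}\frac{u(x+\Delta x,y) - u(x,y) + iv(x+\Delta x,y) - iv(x,y)}{\Delta x}
= \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} \tag{F.10}$$

from the definition of a partial derivative.

¹For a fuller introduction, see for example Boas (1983, chapter 14).

(ii) By considering $\Delta z = i\Delta y$ (i.e. $\Delta x = 0$). In this case

$$\frac{df}{dz} = \lim_{\Delta y\to 0}\frac{u(x,y+\Delta y) - u(x,y) + iv(x,y+\Delta y) - iv(x,y)}{i\,\Delta y}
= \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}. \tag{F.11}$$

Equating (F.10) and (F.11) we obtain the Cauchy-Riemann (CR) relations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{F.12}$$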
which are the necessary and sufficient conditions for f to be analytic. 
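
A quick way to see the CR relations (F.12) at work is to estimate the four partial derivatives by finite differences. The sketch below does this for one function of $z$ alone and for $|z|^2 = z^*z$, which is not analytic; the two sample functions, the step size and the helper names are assumptions chosen for illustration.

```python
def partials(func, x, y, h=1e-6):
    """Central-difference estimates of (du/dx, du/dy, dv/dx, dv/dy) for f = u + iv."""
    dfdx = (func(complex(x + h, y)) - func(complex(x - h, y))) / (2.0 * h)
    dfdy = (func(complex(x, y + h)) - func(complex(x, y - h))) / (2.0 * h)
    return dfdx.real, dfdy.real, dfdx.imag, dfdy.imag

def cr_residuals(func, x, y):
    """Return (du/dx - dv/dy, du/dy + dv/dx); both vanish when (F.12) holds."""
    ux, uy, vx, vy = partials(func, x, y)
    return ux - vy, uy + vx

analytic = lambda z: z * z * z + 2.0 * z       # a polynomial in z only
non_analytic = lambda z: z.conjugate() * z     # |z|^2 depends on z* as well

print(cr_residuals(analytic, 0.7, -0.4))       # both entries close to zero
print(cr_residuals(non_analytic, 0.7, -0.4))   # entries clearly non-zero
```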
Consider now an integral of the form 
$$I = \oint_C f(z)\,dz \tag{F.13}$$

where again the symbol $\oint$ means that the integration path (or contour) in the
complex plane is closed. Inserting $f = u + iv$ and $z = x + iy$, we may write
(F.13) as

$$I = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy). \tag{F.14}$$

Thus the single complex integral (F.13) is equivalent to the two real-plane
integrals (F.14); one is the real part of $I$, the other is the imaginary part,
and each is of the form (F.1). In the first, we have $P = u$, $Q = -v$. Hence
the condition (F.5) for the integral to vanish is
$\partial u/\partial y = -\partial v/\partial x$, which is precisely the second CR
relation! Similarly, in the second integral in (F.14) we have $P = v$ and $Q = u$
so that condition (F.5) becomes $\partial v/\partial y = \partial u/\partial x$,
which is the first CR relation. It follows that if $f(z)$ is analytic inside and on
$C$, then

$$\oint_C f(z)\,dz = 0, \tag{F.15}$$


a result known as Cauchy’s theorem, the foundation of complex integral cal- 
culus. 

Now let us consider a simple case in which (as in (F.7)) the result of 
integrating a complex function around a closed curve is not zero — namely the 
integral 

$$\oint_C \frac{dz}{z} \tag{F.16}$$

where $C$ is the circle of radius $\rho$ enclosing the origin. On this circle,
$z = \rho e^{i\theta}$ where $\rho$ is fixed and $0 \leq \theta \leq 2\pi$, so

$$\oint_C \frac{dz}{z} = \oint_C \frac{\rho e^{i\theta}\,i\,d\theta}{\rho e^{i\theta}} = i\int_0^{2\pi} d\theta = 2\pi i. \tag{F.17}$$

Cauchy’s theorem does not apply in this case because the function being
integrated ($z^{-1}$) is not analytic at $z = 0$. Writing $dz/z$ in terms of $x$ and $y$
we have

$$\frac{dz}{z} = \frac{dx + i\,dy}{x + iy} = \frac{(x - iy)}{x^2+y^2}\,(dx + i\,dy)
= \left(\frac{x\,dx + y\,dy}{x^2+y^2}\right) + i\left(\frac{-y\,dx + x\,dy}{x^2+y^2}\right). \tag{F.18}$$


The reader will recognize the imaginary part of (F.18) as involving precisely 
the functions (F.7) studied earlier, and may like to find the real potential 
function appropriate to the real part of (F.18). 

We note that the result (F.17) is independent of the circle’s radius $\rho$. This
means that we can shrink or expand the circle how we like, without affecting
the answer. The reader may like to show that the circle can, in fact, be
distorted into a simple closed loop of any shape, enclosing $z = 0$, and the answer
will still be $2\pi i$. In general, a contour may be freely distorted in any region
in which the integrand is analytic.

We are now in a position to prove the main integration formula we need, 
which is Cauchy’s integral formula: let f(z) be analytic inside and on a simple 
closed curve C which encloses the point z = a; then 


$$\oint_C \frac{f(z)}{z-a}\,dz = 2\pi i\,f(a) \tag{F.19}$$

where it is understood that $C$ is traversed in an anticlockwise sense around
$z = a$. The proof follows. The integrand in (F.19) is analytic inside and on $C$,
except at $z = a$; we may therefore distort the contour $C$ by shrinking it into a
very small circle of fixed radius $\rho$ around the point $z = a$. On this circle, $z$ is
given by $z = a + \rho e^{i\theta}$, and

$$\oint_C \frac{f(z)}{z-a}\,dz = \int_0^{2\pi}\frac{f(a+\rho e^{i\theta})}{\rho e^{i\theta}}\,\rho e^{i\theta}\,i\,d\theta
= \int_0^{2\pi} f(a+\rho e^{i\theta})\,i\,d\theta. \tag{F.20}$$

Now, since $f$ is analytic at $z = a$, it has a unique derivative there, and is
consequently continuous at $z = a$. We may then take the limit $\rho \to 0$ in
(F.20), obtaining $\lim_{\rho\to 0} f(a + \rho e^{i\theta}) = f(a)$, and hence

$$\oint_C \frac{f(z)}{z-a}\,dz = f(a)\int_0^{2\pi} i\,d\theta = 2\pi i\,f(a) \tag{F.21}$$
as stated. 
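
Cauchy’s integral formula (F.19), and the freedom to distort the contour, can be illustrated numerically. In the sketch below the contour is always a circle, discretized in $\theta$; the choice $f(z) = e^z$, the point $a$, the circles and the function names are assumptions made for the example.

```python
import cmath
import math

def contour_integral(func, centre, radius, n=4096):
    """Integrate func(z) dz anticlockwise round the circle |z - centre| = radius,
    using z = centre + radius * exp(i theta) and dz = i (z - centre) d theta."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta
        offset = radius * cmath.exp(1j * theta)
        total += func(centre + offset) * 1j * offset * dtheta
    return total

f = cmath.exp                 # analytic everywhere (illustrative choice)
a = 0.3 + 0.2j                # the point z = a of (F.19)
integrand = lambda z: f(z) / (z - a)

lhs_small = contour_integral(integrand, a, 0.5)      # small circle centred on a
lhs_large = contour_integral(integrand, 0.0, 2.0)    # different circle, still enclosing a
outside = contour_integral(integrand, 3.0, 1.0)      # a lies outside this circle

print(lhs_small, lhs_large, 2j * math.pi * f(a))     # first three should agree
print(outside)                                       # should vanish (Cauchy's theorem)
```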
We now use these results to establish the representation of the $\theta$-function
(see (E.47)) quoted in section 6.3.2. Consider the function $F(t)$ of the real
variable $t$ defined by

$$F(t) = \frac{i}{2\pi}\oint_{C=C_1+C_2}\frac{e^{-izt}}{z + i\epsilon}\,dz \tag{F.22}$$

FIGURE F.1
Contours for F(t): (a) t < 0; (b) t > 0.

where $\epsilon$ is an infinitesimally small positive number (i.e. it will tend to zero
through positive values). The closed contour $C$ is made up of $C_1$ which is the
real axis from $-R$ to $R$ (we shall let $R \to \infty$ at the end), and of $C_2$ which is
a large semicircle of radius $R$ with diameter the real axis, in either the upper
or lower half-plane, the choice being determined by the sign of $t$, as we shall
now explain (see figure F.1). Suppose first that $t < 0$, and let $z$ on $C_2$ be
parametrized as $z = Re^{i\theta} = R\cos\theta + iR\sin\theta$. Then


$$e^{-izt} = e^{iz|t|} = e^{-R\sin\theta\,|t|}\,e^{iR\cos\theta\,|t|} \tag{F.23}$$

from which it follows that the contribution to (F.22) from $C_2$ will vanish
exponentially as $R \to \infty$ provided that $\theta > 0$, i.e. we choose $C_2$ to be in
the upper half-plane (figure F.1(a)). In that case the integrand of (F.22) is
analytic inside and on $C$ (the only non-analytic point is outside $C$ at $z = -i\epsilon$)
and so

$$F(t) = 0 \quad \text{for } t < 0. \tag{F.24}$$


However, suppose $t > 0$. Then

$$e^{-izt} = e^{R\sin\theta\,t}\,e^{-iR\cos\theta\,t} \tag{F.25}$$

and in this case we must choose the ‘contour-closing’ $C_2$ to be in the lower
half-plane ($\theta < 0$) or else (F.25) will diverge exponentially as $R \to \infty$. With
this choice the $C_2$ contribution will again go to zero as $R \to \infty$. However,
this time the whole closed contour $C$ does enclose the point $z = -i\epsilon$ (see
figure F.1(b)), and we may apply Cauchy’s integral formula to get, for $t > 0$,

$$F(t) = -2\pi i\,\frac{i}{2\pi}\,e^{-\epsilon t}, \tag{F.26}$$

the minus sign at the front arising from the fact (see figure F.1(b)) that $C$ is
now being traversed in a clockwise sense around $z = -i\epsilon$ (this just inverts the
limits in (F.21)). Thus as $\epsilon \to 0$,

$$F(t) \to 1 \quad \text{for } t > 0. \tag{F.27}$$


Summarizing these manoeuvres, for $t < 0$ we chose $C_2$ in (F.22) in the upper
half-plane (figure F.1(a)), and its contribution vanished as $R \to \infty$. In this
case we have, as $R \to \infty$,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z + i\epsilon}\,dz = 0 \quad \text{for } t < 0. \tag{F.28}$$

For $t > 0$ we chose $C_2$ in the lower half-plane (figure F.1(b)), when again its
contribution vanished as $R \to \infty$. However, in this case $F$ does not vanish,
but instead we have, as $R \to \infty$,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z + i\epsilon}\,dz = 1 \quad \text{for } t > 0. \tag{F.29}$$

Equations (F.28) and (F.29) show that we may indeed write

$$\theta(t) = \lim_{\epsilon\to 0}\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z + i\epsilon}\,dz \tag{F.30}$$

as claimed in section 6.3, equation (6.93).


G 


Green Functions 


Let us start with a simple but important example. We seek the solution
$G_0(\mathbf{r})$ of the equation

$$\nabla^2 G_0(\mathbf{r}) = \delta(\mathbf{r}). \tag{G.1}$$

There is a ‘physical’ way to look at this equation which will give us the answer
straightaway. Recall that Gauss’ law in electrostatics (appendix C) is

$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0 \tag{G.2}$$

and that $\mathbf{E}$ is expressed in terms of the electrostatic potential $V$ as
$\mathbf{E} = -\nabla V$. Then (G.2) becomes

$$\nabla^2 V = -\rho/\epsilon_0 \tag{G.3}$$

which is known as Poisson’s equation. Comparing (G.3) and (G.1), we see
that $(-G_0(\mathbf{r})/\epsilon_0)$ can be regarded as the ‘potential’ due to a source
$\rho$ which is concentrated entirely at the origin, and whose total ‘charge’ is unity,
since (see appendix E)

$$\int \delta(\mathbf{r})\,d^3\mathbf{r} = 1. \tag{G.4}$$

In other words, $(-G_0/\epsilon_0)$ is effectively the potential due to a unit point charge
at the origin. But we know exactly what this potential is from Coulomb’s law,
namely

$$\frac{-G_0(\mathbf{r})}{\epsilon_0} = \frac{1}{4\pi\epsilon_0 r} \tag{G.5}$$

whence

$$G_0(\mathbf{r}) = -\frac{1}{4\pi r}. \tag{G.6}$$


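We may also check this result mathematically as follows. Using (G.6),
equation (G.1) is equivalent to

$$\nabla^2\frac{1}{r} = -4\pi\,\delta(\mathbf{r}). \tag{G.7}$$

Let us consider the integral of both sides of this equation over a spherical
volume of arbitrary radius $R$ surrounding the origin. The integral of the
left-hand side becomes, using Gauss’ divergence theorem,

$$\int_V \nabla^2\!\left(\frac{1}{r}\right) d^3\mathbf{r} = \int_V \nabla\cdot\nabla\!\left(\frac{1}{r}\right) d^3\mathbf{r}
= \int_{S\ \text{bounding}\ V} \nabla\!\left(\frac{1}{r}\right)\cdot\hat{\mathbf{n}}\,dS. \tag{G.8}$$

Now $\nabla(1/r) = -\hat{\mathbf{r}}/r^2$, which is equal to $-\hat{\mathbf{r}}/R^2$
on the surface $S$, while $\hat{\mathbf{n}} = \hat{\mathbf{r}}$ and $dS = R^2\,d\Omega$ with
$d\Omega$ the element of solid angle on the sphere. So

$$\int_V \nabla^2\!\left(\frac{1}{r}\right) d^3\mathbf{r} = -\int d\Omega = -4\pi \tag{G.9}$$

which using (G.4) is precisely the integral of the right-hand side of (G.7), as
required.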
Consider now the solutions of 


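$$(\nabla^2 + k^2)\,G_k(\mathbf{r}) = \delta(\mathbf{r}). \tag{G.10}$$

We are interested in rotationally invariant solutions, for which $G_k$ is a function
of $r = |\mathbf{r}|$ alone. For $r \neq 0$, equation (G.10) is easy to solve. Setting
$G_k(r) = f(r)/r$, and using

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\,\frac{\partial}{\partial r} + \text{parts depending on } \frac{\partial}{\partial\theta} \text{ and } \frac{\partial}{\partial\phi}$$

we find that $f(r)$ satisfies

$$\frac{d^2 f}{dr^2} + k^2 f = 0$$

the general solution to which is ($k = |\mathbf{k}|$)

$$f(r) = Ae^{ikr} + Be^{-ikr},$$

leading to

$$G_k(r) = A\,\frac{e^{ikr}}{r} + B\,\frac{e^{-ikr}}{r} \tag{G.11}$$

for $r \neq 0$. In the application to scattering problems (appendix H) we shall
want $G_k$ to contain purely outgoing waves, so we will pick the ‘A’-type solution
in (G.11).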

Consider therefore the expression 


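$$(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) \tag{G.12}$$

where $r$ is now allowed to take the value zero. Making use of the vector
operator result

$$\nabla^2(fg) = (\nabla^2 f)\,g + 2\nabla f\cdot\nabla g + f\,(\nabla^2 g)$$

with ‘$f$’ $= e^{ikr}$ and ‘$g$’ $= 1/r$, together with

$$\nabla e^{ikr} = \frac{ik\mathbf{r}}{r}\,e^{ikr}, \qquad \nabla\frac{1}{r} = -\frac{\mathbf{r}}{r^3}, \qquad
\nabla^2 e^{ikr} = \frac{2ik}{r}\,e^{ikr} - k^2 e^{ikr}$$

we find

$$\begin{aligned}
(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) &= Ae^{ikr}\,\nabla^2\!\left(\frac{1}{r}\right) \\
&= -4\pi A e^{ikr}\,\delta(\mathbf{r}) \\
&= -4\pi A\,\delta(\mathbf{r})
\end{aligned} \tag{G.13}$$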


where, in passing to the last line of (G.13), we have replaced $r$ by zero in the
exponent, since the $\delta$-function ensures that only this point need be
considered for this term. By choosing the constant $A = -1/4\pi$, we find that
the (outgoing wave) solution of (G.10) is

$$G_k^{(+)}(r) = -\frac{e^{ikr}}{4\pi r}. \tag{G.14}$$


We are also interested in spherically symmetric solutions of (restoring $c$
and $\hbar$ explicitly for the moment)

$$\left(\nabla^2 - \frac{m^2c^2}{\hbar^2}\right)\phi(\mathbf{r}) = \delta(\mathbf{r}) \tag{G.15}$$

which is the equation analogous to (G.1) for a static classical scalar potential
of a field whose quanta have mass $m$. The solutions to (G.15) are easily found
from the previous work by letting $k \to imc/\hbar$. Retaining now the solution
which goes to zero as $r \to \infty$, we find

$$\phi(r) = -\frac{1}{4\pi}\,\frac{e^{-r/a}}{r} \tag{G.16}$$

where $a = \hbar/mc$, the Compton wavelength of the quantum, with mass $m$. The
potential (G.16) is (up to numerical constants) the famous Yukawa potential,
in which the quantity ‘$a$’ is called the range: as $r$ gets greater than $a$, $\phi(r)$
becomes exponentially small. Thus, just as the Coulomb potential is the
solution of Poisson’s equation (G.3) corresponding to a point source at the origin,
so the Yukawa potential is the solution of the analogous equation (G.15), also
with a point source at the origin. Note that as $a \to \infty$, $\phi(r) \to G_0(r)$.
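
Away from the origin, (G.16) should satisfy $(\nabla^2 - 1/a^2)\phi = 0$, and it should reduce to $G_0$ of (G.6) as $a \to \infty$. The sketch below checks both statements numerically, using the radial form $\nabla^2\phi = r^{-1}\,d^2(r\phi)/dr^2$ for a spherically symmetric function; the finite-difference step, the sample values of $a$ and $r$, and the function names are illustrative assumptions.

```python
import math

def yukawa(r, a):
    """phi(r) of (G.16), with range a."""
    return -math.exp(-r / a) / (4.0 * math.pi * r)

def coulomb_green(r):
    """G_0(r) of (G.6)."""
    return -1.0 / (4.0 * math.pi * r)

def radial_laplacian(func, r, h=1e-4):
    """Laplacian of a spherically symmetric function, (1/r) d^2/dr^2 [r func(r)],
    estimated by central differences."""
    rf = lambda s: s * func(s)
    return (rf(r + h) - 2.0 * rf(r) + rf(r - h)) / (h * h * r)

a = 2.0
r = 1.3
# Residual of (G.15) at a point r != 0, where the delta-function source vanishes.
residual = radial_laplacian(lambda s: yukawa(s, a), r) - yukawa(r, a) / a**2
print(residual)                                  # close to zero
print(yukawa(r, 1e9), coulomb_green(r))          # nearly equal for very large a
```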

Functions such as $G_k$, $G_0$ and $\phi$, which generically satisfy equations of the
form

$$\Omega_{\mathbf{r}}\,G(\mathbf{r}) = \delta(\mathbf{r}) \tag{G.17}$$

where $\Omega_{\mathbf{r}}$ is some linear differential operator, are said to be Green
functions of the operator $\Omega_{\mathbf{r}}$. From the examples already treated, it is
clear that $G(\mathbf{r})$ in (G.17) has the general interpretation of a ‘potential’ due
to a point source at the origin, when $\Omega_{\mathbf{r}}$ is the appropriate operator for
the field theory in question.

Green functions play an important role in the solution of differential
equations of the type

$$\Omega_{\mathbf{r}}\,\psi(\mathbf{r}) = s(\mathbf{r}) \tag{G.18}$$

where $s(\mathbf{r})$ is a known ‘source function’ (e.g. the charge density in (G.3)).
The solution of (G.18) may be written as

$$\psi(\mathbf{r}) = u(\mathbf{r}) + \int G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \tag{G.19}$$

where $u(\mathbf{r})$ is a solution of $\Omega_{\mathbf{r}}\,u(\mathbf{r}) = 0$. Thus once we
know $G$, we have the solution via (G.19).

Equation (G.19) has a simple physical interpretation. We know that $G(\mathbf{r})$
is the solution of (G.18) with $s(\mathbf{r})$ replaced by $\delta(\mathbf{r})$. But by writing

$$s(\mathbf{r}) = \int \delta(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \tag{G.20}$$

we can formally regard $s(\mathbf{r})$ as being made up of a superposition of point
sources, distributed at points $\mathbf{r}'$ with a weighting function $s(\mathbf{r}')$. Then,
since the operator $\Omega_{\mathbf{r}}$ is (by assumption) linear, the solution for such a
superposition of point sources must be just the same superposition of the point
source solutions, namely the integral on the right-hand side of (G.19). This integral
term is, in fact, the ‘particular integral’ of the differential equation (G.18),
while the $u(\mathbf{r})$ is the ‘complementary function’.

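Equation (G.19) can also be checked analytically. First note that it is
generally the case that the operator $\Omega_{\mathbf{r}}$ is translationally invariant,
so that

$$\Omega_{\mathbf{r}} = \Omega_{\mathbf{r}-\mathbf{r}'}; \tag{G.21}$$

the right-hand side of (G.21) amounts to shifting the origin to the point $\mathbf{r}'$.
Applying $\Omega_{\mathbf{r}}$ to both sides of (G.19), we find

$$\begin{aligned}
\Omega_{\mathbf{r}}\,\psi(\mathbf{r}) &= \Omega_{\mathbf{r}}\,u(\mathbf{r}) + \int \Omega_{\mathbf{r}}\,G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \\
&= 0 + \int \Omega_{\mathbf{r}-\mathbf{r}'}\,G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' = \int \delta(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \\
&= s(\mathbf{r})
\end{aligned}$$

as required in (G.18).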
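Finally, consider the Fourier transform of equation (G.10), defined as

$$\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\,G_k(\mathbf{r})\,d^3\mathbf{r} = \int e^{-i\mathbf{q}\cdot\mathbf{r}}\,\delta(\mathbf{r})\,d^3\mathbf{r}.$$

The right-hand side is unity, by equation (G.4). On the left-hand side we may
use the result

$$\int u(\mathbf{r})\,\nabla^2 v(\mathbf{r})\,d^3\mathbf{r} = \int \left(\nabla^2 u(\mathbf{r})\right) v(\mathbf{r})\,d^3\mathbf{r}$$

(proved by integrating by parts, assuming $u$ and $v$ go to zero sufficiently fast
at the boundaries of the integral) to obtain

$$\begin{aligned}
\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\,G_k(\mathbf{r})\,d^3\mathbf{r}
&= \int \left\{(\nabla^2 e^{-i\mathbf{q}\cdot\mathbf{r}}) + k^2 e^{-i\mathbf{q}\cdot\mathbf{r}}\right\} G_k(\mathbf{r})\,d^3\mathbf{r} \\
&= \int (-\mathbf{q}^2 + k^2)\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,G_k(\mathbf{r})\,d^3\mathbf{r} \\
&= (-\mathbf{q}^2 + k^2)\,\tilde{G}_k(\mathbf{q})
\end{aligned}$$

where $\tilde{G}_k(\mathbf{q})$ is the Fourier transform of $G_k(\mathbf{r})$. Since this
expression has to equal unity, we have

$$\tilde{G}_k(\mathbf{q}) = \frac{1}{k^2 - \mathbf{q}^2}. \tag{G.22}$$

There is, however, a problem with (G.22) as it stands, which is that it is
undefined when the variable $\mathbf{q}^2$ takes the value equal to the parameter $k^2$ in
the original equation. Indeed, various definitions are possible, corresponding
to the type of solution in $\mathbf{r}$-space for $G_k(\mathbf{r})$ (i.e. ingoing, outgoing or
standing wave). It turns out (see the exercise at the end of this appendix) that the
specification which is equivalent to the solution $G_k^{(+)}(r)$ in (G.14) is to add
an infinitesimally small imaginary part in the denominator of (G.22):

$$\tilde{G}_k^{(+)}(\mathbf{q}) = \frac{1}{k^2 - \mathbf{q}^2 + i\epsilon}. \tag{G.23}$$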


In exactly the same way, the Fourier transform of $\phi(\mathbf{r})$ satisfying (G.15) is

$$\tilde{\phi}(\mathbf{q}) = \frac{-1}{\mathbf{q}^2 + m^2} \tag{G.24}$$

where we have reverted to units such that $\hbar = c = 1$.
The relativistic generalization of this result is straightforward. Consider
the equation

$$(\Box + m^2)\,G(x) = -\delta(x) \tag{G.25}$$

where $x$ is the coordinate 4-vector and $\delta(x)$ is the four-dimensional
$\delta$-function, $\delta(x^0)\,\delta(\mathbf{x})$; the sign in (G.25) has been chosen to be
consistent with (G.15) in the static case. Taking the four-dimensional Fourier
transform, and making suitable assumptions about the vanishing of $G$ at the
boundary of space-time, we obtain

$$(-q^2 + m^2)\,\tilde{G}(q) = -1 \tag{G.26}$$

where

$$\tilde{G}(q) = \int e^{iq\cdot x}\,G(x)\,d^4x$$

and so

$$\tilde{G}(q) = \frac{1}{q^2 - m^2}. \tag{G.27}$$

As we have seen in detail in chapter 6, the Feynman prescription for selecting
the physically desired solution amounts to adding an ‘$i\epsilon$’ term in the
denominator of (G.27):

$$\tilde{G}^{(+)}(q) = \frac{1}{q^2 - m^2 + i\epsilon}. \tag{G.28}$$


Exercise 


Verify the ‘$i\epsilon$’ specification in (G.23), using the methods of appendix F. [Hint:
You need to show that the Fourier transform of (G.23), defined by

$$G_k^{(+)}(\mathbf{r}) = \frac{1}{(2\pi)^3}\int e^{i\mathbf{q}\cdot\mathbf{r}}\,\tilde{G}_k^{(+)}(\mathbf{q})\,d^3\mathbf{q}, \tag{G.29}$$

is equal to $G_k^{(+)}(r)$ of (G.14). Do the integration over the polar angles of
$\mathbf{q}$, taking the direction of $\mathbf{r}$ as the polar axis. This gives

$$G_k^{(+)}(r) = \frac{-1}{8\pi^2 i r}\int_{-\infty}^{\infty}\frac{(e^{iqr} - e^{-iqr})\,q\,dq}{q^2 - k^2 - i\epsilon}$$

where $q = |\mathbf{q}|$, $r = |\mathbf{r}|$, and we have used the fact that the integrand is an
even function of $q$ to extend the lower limit to $-\infty$, with an overall factor of 1/2.
Now convert $q$ to the complex variable $z$. Locate the poles of
$(z^2 - k^2 - i\epsilon)^{-1}$ (compare the similar calculation in section 10.3.1, and in
appendix F). Apply Cauchy’s integral formula (F.19), closing the $e^{izr}$ part in the
upper half $z$-plane, and the $e^{-izr}$ part in the lower half $z$-plane.]


H 


Elements of Non-relativistic Scattering 
Theory 


H.1 Time-independent formulation and differential cross 
section 


We consider the scattering of a particle of mass $m$ by a fixed spherically
symmetric potential $V(\mathbf{r})$; we shall retain $\hbar$ explicitly in what follows. The
potential is assumed to go to zero rapidly as $r \to \infty$, as for the Yukawa
potential (G.16); it will turn out that the important Coulomb case can be
treated as the $a \to \infty$ limit of (G.16). We shall treat the problem here as a
stationary state one, in which the Schrödinger wavefunction $\psi(\mathbf{r},t)$ has the
form

$$\psi(\mathbf{r},t) = \phi(\mathbf{r})\,e^{-iEt/\hbar} \tag{H.1}$$

where $E$ is the particle’s energy, and where $\phi(\mathbf{r})$ satisfies the equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\phi(\mathbf{r}) = E\,\phi(\mathbf{r}). \tag{H.2}$$

We shall take $V$ to be spherically symmetric, so that $V(\mathbf{r}) = V(r)$ where
$r = |\mathbf{r}|$. In this approach to scattering, we suppose the potential to be ‘bathed’
in a steady flux of incident particles, all of energy $E$. The wavefunction for
the incident beam, far from the region near the origin where $V$ is appreciably
non-zero, is then just a plane wave of the form $\phi_{\rm inc} = e^{ikz}$ where the $z$-axis
has been chosen along the propagation direction, and where $E = \hbar^2k^2/2m$
with $\mathbf{k} = (0,0,k)$. This plane wave is normalized to one particle per unit
volume, and yields a steady-state flux of

$$\mathbf{j}_{\rm inc} = \frac{\hbar}{2mi}\left[\phi_{\rm inc}^*\nabla\phi_{\rm inc} - \phi_{\rm inc}\nabla\phi_{\rm inc}^*\right]
= \hbar\mathbf{k}/m = \mathbf{p}/m \tag{H.3}$$

where the momentum is $\mathbf{p} = \hbar\mathbf{k}$. As expected, the incident flux is given by
the velocity $\mathbf{v}$ per unit volume.

Though we have represented the incident beam as a plane wave, it will,
in practice, be collimated. We could, of course, superpose such plane waves,
with different $\mathbf{k}$’s, to make a wave-packet of any desired localization. But
the dimensions of practical beams are so much greater than the de Broglie
wavelength $\lambda = h/p$ of our particles, that our plane wave will be a very good
approximation to a realistic packet.

The form of the complete solution to (H.2), even in the region where $V$ is
essentially zero, is not simply the incident plane wave, however. The presence
of the potential gives rise also to a scattered wave, whose form as $r \to \infty$ is

$$\phi_{\rm sc} \to f(\theta,\phi)\,\frac{e^{ikr}}{r}. \tag{H.4}$$

We shall actually derive this later, but its physical interpretation is simply
that it is an outgoing ($\sim e^{ikr}$ rather than $e^{-ikr}$) ‘spherical wave’, with a factor
$f(\theta,\phi)$ called the scattering amplitude that allows for the fact that even though
$V(r)$ is spherically symmetric, the solution, in general, will not be (recall
the bound-state solutions of the Coulomb potential in the hydrogen atom).
Calculating the radial component of the flux corresponding to (H.4) yields

$$j_{r,\rm sc} = \frac{\hbar}{2mi}\left[\phi_{\rm sc}^*\,\frac{\partial}{\partial r}\phi_{\rm sc} - \phi_{\rm sc}\,\frac{\partial}{\partial r}\phi_{\rm sc}^*\right]
= \frac{\hbar k}{m}\,\frac{|f(\theta,\phi)|^2}{r^2}. \tag{H.5}$$

The flux in the two non-radial directions will contain an extra power of $r$ in
the denominator (recall that

$$\nabla = \hat{\mathbf{r}}\,\frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}}\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\boldsymbol{\phi}}\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\ )$$

and so (H.5) represents the correct asymptotic form of the scattered flux.
The cross section is now easily found. The differential cross section, $d\sigma$,
for scattering into the element of solid angle $d\Omega$ is defined by

$$d\sigma = j_{r,\rm sc}\,dS/|\mathbf{j}_{\rm inc}| \tag{H.6}$$

where $dS = r^2\,d\Omega$, so that from (H.3) and (H.5)

$$\frac{d\sigma}{d\Omega} = |f(\theta,\phi)|^2. \tag{H.7}$$

The total cross section is then just

$$\sigma = \int |f(\theta,\phi)|^2\,d\Omega. \tag{H.8}$$

It is important to realize that the complete asymptotic form of the solution
to (H.2) is the superposition of $\phi_{\rm inc}$ and $\phi_{\rm sc}$:

$$\phi(\mathbf{r}) \xrightarrow{\ r\to\infty\ } e^{ikz} + f(\theta,\phi)\,\frac{e^{ikr}}{r}. \tag{H.9}$$

Note that in the ‘forward direction’ (i.e. within a region close to the $z$-axis, as
determined by the collimation), the incident and scattered waves will
interfere. Careful analysis reveals a depletion of the incident beam in the forward
direction (the ‘shadow’ of the scattering centre), which corresponds exactly
to the total flux scattered into all angles (Gottfried 1966, section 12.3). This
is expressed in the optical theorem:

$$\operatorname{Im} f(0) = \frac{k}{4\pi}\,\sigma. \tag{H.10}$$


H.2 Expression for the scattering amplitude: Born 
approximation 


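We begin by rewriting (H.2) as

$$(\nabla^2 + k^2)\,\phi(\mathbf{r}) = \frac{2m}{\hbar^2}\,V(\mathbf{r})\,\phi(\mathbf{r}). \tag{H.11}$$

This equation is of exactly the form discussed in appendix G, e.g. equation
(G.18) with $\Omega_{\mathbf{r}} = \nabla^2 + k^2$. Further, we know that the Green function
for this $\Omega_{\mathbf{r}}$, corresponding to the desired outgoing wave solution, is given by
(G.14). Using then (G.19) and (G.14), we can immediately write the ‘formal
solution’ of (H.11) as

$$\phi(\mathbf{r}) = e^{ikz} - \frac{1}{4\pi}\int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,\frac{2m}{\hbar^2}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.12}$$

where we have chosen ‘$u(\mathbf{r})$’ in (G.19) to be the incident plane wave $\phi_{\rm inc}$, and
have used $\mathbf{k}\cdot\mathbf{r} = kz$. We say ‘formal’ because of course the unknown
$\phi(\mathbf{r}')$ still appears on the right-hand side of (H.12).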

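It may therefore seem that we have made no progress, but in fact (H.12)
leads to a very useful expression for $f(\theta,\phi)$, which is the quantity we need to
calculate. This can be found by considering the asymptotic ($r \to \infty$) limit of
the integral term in (H.12). We have

$$|\mathbf{r}-\mathbf{r}'| = (r^2 + r'^2 - 2\mathbf{r}\cdot\mathbf{r}')^{1/2}
\approx r - \mathbf{r}\cdot\mathbf{r}'/r + O\!\left(\frac{1}{r}\right) \text{ terms}. \tag{H.13}$$

Thus in the exponent we may write

$$e^{ik|\mathbf{r}-\mathbf{r}'|} \approx e^{ik(r - \mathbf{r}\cdot\mathbf{r}'/r)} = e^{ikr}\,e^{-i\mathbf{k}'\cdot\mathbf{r}'}$$

where $\mathbf{k}' = k\hat{\mathbf{r}}$ is the outgoing wavevector, pointing along the direction of the
outgoing scattered wave which enters $dS$. In the denominator factor we may
simply say $|\mathbf{r}-\mathbf{r}'|^{-1} \approx 1/r$ since the next term in (H.13) will produce a
correction of order $r^{-2}$. Putting this together, we have

$$\phi(\mathbf{r}) \xrightarrow{\ r\to\infty\ } e^{ikz} - \frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.14}$$

from which follows the formula for $f(\theta,\phi)$:

$$f(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}'. \tag{H.15}$$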

No approximations have been made thus far, in deriving (H.15), but of
course it still involves the unknown $\phi(\mathbf{r}')$ inside the integral. However, it is
in a form which is very convenient for setting up a systematic approximation
scheme, a kind of perturbation theory, in powers of $V$. If the potential is
relatively ‘weak’, its effect will be such as to produce only a slight distortion of
the incident wave, and so $\phi(\mathbf{r}) \approx e^{i\mathbf{k}\cdot\mathbf{r}}$ + ‘small correction’. This
suggests that it may be a good approximation to replace $\phi(\mathbf{r}')$ in (H.15) by the
undistorted incident wave $e^{i\mathbf{k}\cdot\mathbf{r}'}$, giving the approximate scattering amplitude

$$f_{\rm BA}(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{i\mathbf{q}\cdot\mathbf{r}'}\,V(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.16}$$

where the wave vector transfer $\mathbf{q}$ is given by

$$\mathbf{q} = \mathbf{k} - \mathbf{k}'. \tag{H.17}$$

This is called the ‘Born approximation to the scattering amplitude’. The
criteria for the validity of the Born approximation are discussed in many
standard quantum mechanics texts.

The approximation can be improved by returning to (H.12) for $\phi(\mathbf{r})$, and
replacing $\phi(\mathbf{r}')$ inside the integral by $e^{i\mathbf{k}\cdot\mathbf{r}'}$ just as we did in (H.16);
this will give us a formula for the first-order (in $V$) correction to $\phi(\mathbf{r})$. We can
now insert this expression for $\phi(\mathbf{r}')$ (i.e. $\phi(\mathbf{r}') = e^{i\mathbf{k}\cdot\mathbf{r}'}$ + $O(V)$
correction) into (H.15), which will give us $f_{\rm BA}$ again as the first term, but also
another term, of order $V^2$ (since $V$ appears in the integral in (H.15)). By iterating the
process indefinitely, the Born series can be set up, to all orders in $V$.
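
To make (H.16) concrete, the sketch below evaluates the Born amplitude for a Yukawa-type potential $V(r) = g\,e^{-r/a}/r$, for which the angular integration reduces the transform to $\tilde{V}(q) = (4\pi/q)\int_0^\infty \sin(qr)\,r\,V(r)\,dr$ and the standard closed form is $4\pi g/(q^2 + a^{-2})$. The coupling `g`, the units $m = \hbar = 1$ used as defaults, the cut-off of the radial integral and all function names are assumptions made for the illustration, not quantities fixed by the text.

```python
import math

def v_tilde_numeric(q, g, a, r_max_in_a=60.0, n=200_000):
    """Fourier transform of V(r) = g exp(-r/a)/r by a midpoint-rule radial integral:
    V~(q) = (4 pi / q) * integral_0^infinity sin(q r) * r V(r) dr."""
    r_max = r_max_in_a * a            # the integrand is negligible beyond this radius
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += math.sin(q * r) * g * math.exp(-r / a)   # r * V(r) = g exp(-r/a)
    return 4.0 * math.pi * total * h / q

def v_tilde_analytic(q, g, a):
    """Closed form of the same transform: 4 pi g / (q^2 + 1/a^2)."""
    return 4.0 * math.pi * g / (q * q + 1.0 / (a * a))

def f_born(q, g, a, m=1.0, hbar=1.0):
    """Born amplitude (H.16), with q = |k - k'| = 2 k sin(theta/2)."""
    return -m / (2.0 * math.pi * hbar**2) * v_tilde_analytic(q, g, a)

def born_cross_section(q, g, a, m=1.0, hbar=1.0):
    """Differential cross section (H.7) evaluated with the Born amplitude."""
    return f_born(q, g, a, m, hbar) ** 2

q, g, a = 1.7, 0.5, 1.0
print(v_tilde_numeric(q, g, a), v_tilde_analytic(q, g, a))
print(born_cross_section(q, g, a))
```

Letting `a` become very large turns the cross section into $(2mg/\hbar^2q^2)^2$, the Coulomb-like $1/q^4$ behaviour, consistent with treating the Coulomb case as the $a \to \infty$ limit of (G.16).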




H.3 Time-dependent approach 


In this approach we consider the potential $V(\mathbf{r})$ as causing transitions
between states describing the incident and scattered particles. From standard
time-dependent perturbation theory in quantum mechanics, the transition
probability per unit time for going from state $|i\rangle$ to state $|f\rangle$, to first order in
$V$, is given by

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\langle f|V|i\rangle|^2\,\rho(E_f)\Big|_{E_f = E_i} \tag{H.18}$$

where $\rho(E_f)\,dE_f$ is the number of final states in the energy range $dE_f$ around
the energy-conserving point $E_i = E_f$. Equation (H.18) is often known as
the ‘Golden Rule’. In the present case, if we adopt the same normalization
as in the previous section, the initial and final states are represented by the
wavefunctions $e^{i\mathbf{k}\cdot\mathbf{r}}$ and $e^{i\mathbf{k}'\cdot\mathbf{r}}$, so that

$$\langle f|V|i\rangle = \int e^{-i\mathbf{k}'\cdot\mathbf{r}}\,V(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3\mathbf{r} \equiv \tilde{V}(\mathbf{q}). \tag{H.19}$$

Also, the number of such states in a volume element $d^3\mathbf{p}'$ of momentum space
($\mathbf{p}' = \hbar\mathbf{k}'$) is $d^3\mathbf{p}'/(2\pi\hbar)^3$.

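In spherical polar coordinates, with $d\Omega$ standing for the element of solid
angle around the direction $(\theta,\phi)$ of $\mathbf{p}'$, we have

$$d^3\mathbf{p}' = \mathbf{p}'^2\,d|\mathbf{p}'|\,d\Omega = m|\mathbf{p}'|\,dE'\,d\Omega \tag{H.20}$$

where we have used $E' = \mathbf{p}'^2/2m$. It follows that

$$\rho(E')\,dE' = \frac{d^3\mathbf{p}'}{(2\pi\hbar)^3} = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega\,dE' \tag{H.21}$$

and so

$$\rho(E') = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega. \tag{H.22}$$

Inserting (H.19) and (H.22) into (H.18) we obtain, for this case,

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\tilde{V}(\mathbf{q})|^2\,\frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega. \tag{H.23}$$

To get the cross section, we need to divide this expression by the incident flux,
which is $|\mathbf{p}|/m$ as in (H.3). Since $|\mathbf{p}'| = |\mathbf{p}|$ at the energy-conserving
point, the differential cross section for scattering into the element of solid angle
$d\Omega$ in the direction $(\theta,\phi)$ is

$$d\sigma = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\tilde{V}(\mathbf{q})|^2\,d\Omega. \tag{H.24}$$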


Comparing (H.24) with (H.7) and (H.16), we see that this application of the
Golden Rule (first-order time-dependent perturbation theory) is exactly
equivalent to the Born approximation in the time-independent approach. It is,
however, the time-dependent approach which is much closer to the corresponding
quantum field theory formulation we introduce in chapter 6.

The Schrödinger and Heisenberg Pictures


The standard introductory formalism of quantum mechanics is that of
Schrödinger, in which the dynamical variables (such as $\mathbf{x}$ and
$\hat{\mathbf{p}} = -i\nabla$) are independent of time, while the wavefunction $\psi$ changes
with time according to the general equation

$$\hat{H}\,\psi(\mathbf{x},t) = i\,\frac{\partial\psi(\mathbf{x},t)}{\partial t} \tag{I.1}$$

where $\hat{H}$ is the Hamiltonian. Matrix elements of operators $\hat{A}$ depending on
$\mathbf{x}, \hat{\mathbf{p}}, \ldots$ then have the form

$$\langle\phi|\hat{A}|\psi\rangle = \int \phi^*(\mathbf{x},t)\,\hat{A}\,\psi(\mathbf{x},t)\,d^3\mathbf{x} \tag{I.2}$$

and will, in general, depend on time via the time dependences of $\phi$ and $\psi$.
Although used almost universally in introductory courses on quantum
mechanics, this formulation is not the only possible one, nor is it always the
most convenient.

We may, for example, wish to bring out similarities (and differences) be- 
tween the general dynamical frameworks of quantum and classical mechanics. 
The formulation here does not seem to be well adapted to this purpose, since
in the classical case the dynamical variables depend on time ($\mathbf{x}(t), \mathbf{p}(t), \ldots$)
and obey equations of motion, while the quantum variables $\hat A$ are
time-independent and the 'equation of motion' (I.1) is for the wavefunction $\psi$,
which has no classical counterpart. In quantum mechanics, however, it is
always possible to make unitary transformations of the state vector or wave- 
functions. We can make use of this possibility to obtain an alternative for- 
mulation of quantum mechanics, which is in some ways closer to the spirit of 
classical mechanics, as follows. 

Equation (I.1) can be formally solved to give

$$\psi(\mathbf{x}, t) = e^{-i\hat H t}\, \psi(\mathbf{x}, 0) \tag{I.3}$$

where the exponential (of an operator!) can be defined by the corresponding
power series, for example:

$$e^{-i\hat H t} = 1 - i\hat H t + \frac{1}{2!}(-i\hat H t)^2 + \cdots. \tag{I.4}$$

It is simple to check that (I.3), as defined by (I.4), does satisfy (I.1) and that
the operator $\hat U = \exp(-i\hat H t)$ is unitary:

$$\hat U^\dagger = [\exp(-i\hat H t)]^\dagger = \exp(i\hat H^\dagger t) = \exp(i\hat H t) = \hat U^{-1} \tag{I.5}$$


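where the Hermitian property $\hat H^\dagger = \hat H$ has been used. Thus (I.3) can be
viewed as a unitary transformation from the time-dependent wavefunction
$\psi(\mathbf{x}, t)$ to the time-independent one $\psi(\mathbf{x}, 0)$. Correspondingly the matrix
element (I.2) is then

$$\langle \phi|\hat A|\psi\rangle = \int \phi^*(\mathbf{x}, 0)\, e^{i\hat H t} \hat A\, e^{-i\hat H t}\, \psi(\mathbf{x}, 0)\, d^3\mathbf{x} \tag{I.6}$$

which can be regarded as the matrix element of the time-dependent operator

$$\hat A(t) = e^{i\hat H t} \hat A\, e^{-i\hat H t} \tag{I.7}$$

between time-independent wavefunctions $\phi^*(\mathbf{x}, 0)$, $\psi(\mathbf{x}, 0)$.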

Since (I.6) is perfectly general, it is clear that we can calculate amplitudes
in quantum mechanics in either of the two ways outlined: (i) by using
time-dependent $\psi$'s and time-independent $\hat A$'s, which is called the 'Schrödinger
picture'; or (ii) by using time-independent $\psi$'s and time-dependent $\hat A$'s, which
is called the 'Heisenberg picture'. The wavefunctions and operators in the two
pictures are related by (I.3) and (I.7). We note that the pictures coincide at
the (conventionally chosen) time t = 0.
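
The equivalence of the two pictures can be checked numerically in a finite-dimensional toy model. The sketch below is only an illustration: the random Hermitian matrices `H` and `A`, the dimension `n`, the time `t` and the helper `evolution_operator` are all assumptions introduced here, with $\hbar = 1$ as in (I.1).

```python
import numpy as np

def evolution_operator(H, t):
    """U(t) = exp(-i H t) for a Hermitian matrix H, built from its eigendecomposition."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * E * t)) @ V.conj().T

rng = np.random.default_rng(1)
n = 4                                   # illustrative Hilbert-space dimension

def random_hermitian():
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

H = random_hermitian()                  # toy Hamiltonian
A = random_hermitian()                  # toy observable
phi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
psi0 = rng.normal(size=n) + 1j * rng.normal(size=n)

t = 0.7
U = evolution_operator(H, t)

# Schrodinger picture: time-dependent states, fixed operator (I.2).
schrodinger = np.vdot(U @ phi0, A @ (U @ psi0))

# Heisenberg picture: fixed states, time-dependent operator (I.6), (I.7).
A_t = U.conj().T @ A @ U
heisenberg = np.vdot(phi0, A_t @ psi0)

print(abs(schrodinger - heisenberg))    # agreement up to rounding error
```

Using the eigendecomposition rather than a truncated power series (I.4) is a convenience of this sketch; both define the same operator for a Hermitian matrix.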

Since $\hat A(t)$ is now time-dependent, we can ask for its equation of motion.
Differentiating (I.7) carefully, we find (if $\hat A$ does not depend explicitly on t)
that

$$\frac{d\hat A(t)}{dt} = -i[\hat A(t), \hat H] \tag{I.8}$$

which is called the Heisenberg equation of motion for $\hat A(t)$. On the right-hand
side of (I.8), $\hat H$ is the Schrödinger operator; however, if $\hat H$ is substituted for
$\hat A$ in (I.7), one finds $\hat H(t) = \hat H$, so $\hat H$ can equally well be interpreted as the
Heisenberg operator. For simple Hamiltonians $\hat H$, (I.8) leads to operator
equations quite analogous to classical equations of motion, which can sometimes
be solved explicitly (see section 5.2.2 of chapter 5).
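
As a small illustration of an explicitly solvable case, the sketch below takes a two-level system with the assumed toy Hamiltonian $\hat H = \tfrac12\omega\sigma_z$ (not an example from the text) and checks the Heisenberg equation (I.8) for $\hat A = \sigma_x$ by a finite-difference derivative. The value of `omega` and the helper name `heisenberg` are illustrative choices.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

omega = 2.0                 # illustrative level splitting (hbar = 1)
H = 0.5 * omega * sz        # toy two-level Hamiltonian, diagonal in this basis

def heisenberg(A, t):
    """A(t) = exp(iHt) A exp(-iHt), eq. (I.7); valid here because H is diagonal."""
    U_dag = np.diag(np.exp(1j * np.diag(H) * t))   # exp(+iHt)
    return U_dag @ A @ U_dag.conj().T

t, h = 0.3, 1e-5
A_t = heisenberg(sx, t)

# Left-hand side of (I.8): central finite difference for dA/dt.
lhs = (heisenberg(sx, t + h) - heisenberg(sx, t - h)) / (2 * h)
# Right-hand side of (I.8): -i [A(t), H].
rhs = -1j * (A_t @ H - H @ A_t)

# Closed-form solution for this toy model: sigma_x precesses about the z axis.
closed_form = np.cos(omega * t) * sx - np.sin(omega * t) * sy

print(np.max(np.abs(lhs - rhs)))
```

The closed form resembles the classical precession of a vector about a fixed axis, which is the kind of analogy with classical equations of motion referred to above.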

The foregoing ideas apply equally well to the operators and state vectors 
of quantum field theory. 
Dirac Algebra and Trace Identities


J.1 Dirac algebra 


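J.1.1 γ matrices


The fundamental anticommutator
$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{J.1}$$
may be used to prove the following results.

$$\gamma_\mu \gamma^\mu = 4 \tag{J.2}$$
$$\gamma_\mu \slashed{a} \gamma^\mu = -2\slashed{a} \tag{J.3}$$
$$\gamma_\mu \slashed{a}\slashed{b} \gamma^\mu = 4a\cdot b \tag{J.4}$$
$$\gamma_\mu \slashed{a}\slashed{b}\slashed{c} \gamma^\mu = -2\slashed{c}\slashed{b}\slashed{a} \tag{J.5}$$
$$\slashed{a}\slashed{b} = -\slashed{b}\slashed{a} + 2a\cdot b. \tag{J.6}$$

As an example, we prove this last result:
$$\slashed{a}\slashed{b} = a_\mu b_\nu \gamma^\mu \gamma^\nu = a_\mu b_\nu (-\gamma^\nu \gamma^\mu + 2g^{\mu\nu}) = -\slashed{b}\slashed{a} + 2a\cdot b.$$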
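
These results are easy to spot-check numerically. The sketch below assumes the standard Dirac representation and the metric diag(+1, -1, -1, -1); the helper names `slash`, `dot` and `contract` and the random test vectors are illustrative choices, not notation from the text.

```python
import numpy as np

# Dirac-representation gamma matrices gamma^mu (assumed representation).
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4)

def slash(a):
    """a-slash = a^mu gamma_mu, for contravariant components a^mu."""
    return sum(metric[m, m] * a[m] * gamma[m] for m in range(4))

def dot(a, b):
    """Minkowski scalar product a.b."""
    return a @ metric @ b

def contract(X):
    """gamma_mu X gamma^mu (sum over mu)."""
    return sum(metric[m, m] * gamma[m] @ X @ gamma[m] for m in range(4))

rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 4))       # arbitrary real 4-vectors
A, B, C = slash(a), slash(b), slash(c)

checks = {
    "anticommutator": all(
        np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * metric[m, n] * I4)
        for m in range(4) for n in range(4)),
    "gamma_mu gamma^mu = 4": np.allclose(contract(I4), 4 * I4),
    "gamma_mu a gamma^mu = -2a": np.allclose(contract(A), -2 * A),
    "gamma_mu ab gamma^mu = 4a.b": np.allclose(contract(A @ B), 4 * dot(a, b) * I4),
    "gamma_mu abc gamma^mu = -2cba": np.allclose(contract(A @ B @ C), -2 * C @ B @ A),
    "ab = -ba + 2a.b": np.allclose(A @ B, -B @ A + 2 * dot(a, b) * I4),
}
print(checks)
```

A numerical check of this kind does not replace the algebraic proofs, but it catches sign and index mistakes quickly.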


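J.1.2 γ5 identities


Define

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3.$$

In the usual representation with

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}$$

$\gamma_5$ is the matrix

$$\gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Either from the definition or using this explicit form it is easy to prove that
$$\gamma_5^2 = 1 \tag{J.10}$$
and
$$\{\gamma_5, \gamma^\mu\} = 0 \tag{J.11}$$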


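i.e. $\gamma_5$ anticommutes with the other γ-matrices. Defining the totally
antisymmetric tensor

$$\epsilon_{\mu\nu\rho\sigma} = \begin{cases} +1 & \text{for an even permutation of } 0, 1, 2, 3 \\ -1 & \text{for an odd permutation of } 0, 1, 2, 3 \\ \phantom{+}0 & \text{if two or more indices are the same} \end{cases} \tag{J.12}$$

we may write

$$\gamma_5 = \frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\,\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma. \tag{J.13}$$

With this form it is possible to prove

$$\gamma_5\gamma_\sigma = \frac{i}{3!}\,\epsilon_{\mu\nu\rho\sigma}\,\gamma^\mu\gamma^\nu\gamma^\rho \tag{J.14}$$

and the identity

$$\gamma^\mu\gamma^\nu\gamma^\rho = g^{\mu\nu}\gamma^\rho - g^{\mu\rho}\gamma^\nu + g^{\nu\rho}\gamma^\mu + i\gamma_5\,\epsilon^{\mu\nu\rho\sigma}\gamma_\sigma. \tag{J.15}$$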
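
The γ5 identities can also be verified by brute force over all index combinations. The sketch below again assumes the Dirac representation and the convention $\epsilon_{0123} = +1$ of (J.12), so that the tensor with raised indices satisfies $\epsilon^{0123} = -1$; the helper `eps` and the result names are illustrative.

```python
import itertools
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
R = range(4)

g5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def eps(*idx):
    """epsilon_{mu nu rho sigma} with lower indices and epsilon_{0123} = +1."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(idx[i] > idx[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

# (J.10) and (J.11)
j10_ok = np.allclose(g5 @ g5, np.eye(4))
j11_ok = all(np.allclose(g5 @ gamma[m] + gamma[m] @ g5, 0) for m in R)

# (J.13): gamma_5 as a fully antisymmetrized product.
j13 = (1j / 24) * sum(eps(m, n, r, s) * gamma[m] @ gamma[n] @ gamma[r] @ gamma[s]
                      for m, n, r, s in itertools.product(R, repeat=4))
j13_ok = np.allclose(j13, g5)

# (J.14): gamma_5 gamma_sigma, with gamma_sigma = g_{sigma sigma} gamma^sigma.
j14_ok = all(
    np.allclose(g5 @ (metric[s, s] * gamma[s]),
                (1j / 6) * sum(eps(m, n, r, s) * gamma[m] @ gamma[n] @ gamma[r]
                               for m, n, r in itertools.product(R, repeat=3)))
    for s in R)

# (J.15): raising all four indices flips the sign of epsilon for this metric.
def j15_rhs(m, n, r):
    tail = sum(-eps(m, n, r, s) * metric[s, s] * gamma[s] for s in R)
    return (metric[m, n] * gamma[r] - metric[m, r] * gamma[n]
            + metric[n, r] * gamma[m] + 1j * g5 @ tail)

j15_ok = all(np.allclose(gamma[m] @ gamma[n] @ gamma[r], j15_rhs(m, n, r))
             for m, n, r in itertools.product(R, repeat=3))

print(j10_ok, j11_ok, j13_ok, j14_ok, j15_ok)
```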


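J.1.3 Hermitian conjugate of spinor matrix elements


$$[\bar u(p', s')\,\Gamma\, u(p, s)]^\dagger = \bar u(p, s)\,\bar\Gamma\, u(p', s') \tag{J.16}$$
where $\Gamma$ is any collection of γ matrices and
$$\bar\Gamma \equiv \gamma^0 \Gamma^\dagger \gamma^0. \tag{J.17}$$
For example
$$\overline{\gamma^\mu} = \gamma^\mu \tag{J.18}$$
and
$$\overline{\gamma^\mu\gamma_5} = \gamma^\mu\gamma_5. \tag{J.19}$$


J.1.4 Spin sums and projection operators
Positive-energy projection operator:

$$[\Lambda_+(p)]_{\alpha\beta} \equiv \sum_s u_\alpha(p, s)\,\bar u_\beta(p, s) = (\slashed{p} + m)_{\alpha\beta}. \tag{J.20}$$
Negative-energy projection operator:

$$[\Lambda_-(p)]_{\alpha\beta} \equiv -\sum_s v_\alpha(p, s)\,\bar v_\beta(p, s) = (-\slashed{p} + m)_{\alpha\beta}. \tag{J.21}$$
Note that these forms are specific to the normalizations
$$\bar u u = 2m \qquad \bar v v = -2m \tag{J.22}$$

for the spinors.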
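
The spin sums can be checked with explicit spinors. The sketch below assumes Dirac-representation spinors of the common form $u = \sqrt{E+m}\,(\chi,\ \boldsymbol{\sigma}\cdot\mathbf{p}\,\chi/(E+m))$ and $v = \sqrt{E+m}\,(\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi/(E+m),\ \chi)$, which satisfy the normalizations (J.22); the particular mass, momentum and helper names are illustrative.

```python
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

m = 0.5                                   # illustrative mass
p3 = np.array([0.3, -0.4, 1.2])           # illustrative 3-momentum
E = np.sqrt(p3 @ p3 + m**2)
sp = sum(p3[i] * sigma[i] for i in range(3))          # sigma . p
pslash = E * gamma[0] - sum(p3[i] * gamma[i + 1] for i in range(3))

chis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]

def u_spinor(chi):
    return np.sqrt(E + m) * np.concatenate([chi, sp @ chi / (E + m)])

def v_spinor(chi):
    return np.sqrt(E + m) * np.concatenate([sp @ chi / (E + m), chi])

def bar(w):
    """Dirac adjoint w-bar = w^dagger gamma^0 (as a row vector)."""
    return w.conj() @ gamma[0]

# Spin sums (J.20) and (J.21) as 4x4 matrices.
lambda_plus = sum(np.outer(u_spinor(c), bar(u_spinor(c))) for c in chis)
lambda_minus = -sum(np.outer(v_spinor(c), bar(v_spinor(c))) for c in chis)

# Normalizations (J.22) for one spin state.
uu = bar(u_spinor(chis[0])) @ u_spinor(chis[0])
vv = bar(v_spinor(chis[0])) @ v_spinor(chis[0])

print(np.allclose(lambda_plus, pslash + m * np.eye(4)),
      np.allclose(lambda_minus, -pslash + m * np.eye(4)))
```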
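J.2 Trace theorems


$$\mathrm{Tr}\,\mathbf{1} = 4 \quad \text{(theorem 1)} \tag{J.23}$$
$$\mathrm{Tr}\,\gamma_5 = 0 \quad \text{(theorem 2)} \tag{J.24}$$
$$\mathrm{Tr}(\text{odd number of } \gamma\text{'s}) = 0 \quad \text{(theorem 3)}. \tag{J.25}$$
Proof
Consider
$$T \equiv \mathrm{Tr}(\slashed{a}_1 \slashed{a}_2 \cdots \slashed{a}_n) \tag{J.26}$$
where n is odd. Now insert $1 = (\gamma_5)^2$ into T, so that
$$T = \mathrm{Tr}(\slashed{a}_1 \slashed{a}_2 \cdots \slashed{a}_n \gamma_5 \gamma_5). \tag{J.27}$$
Move the first $\gamma_5$ to the front of T by repeatedly using the result
$$\slashed{a}\gamma_5 = -\gamma_5\slashed{a}. \tag{J.28}$$
We therefore pick up n minus signs:
$$T = \mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n) = (-1)^n\, \mathrm{Tr}(\gamma_5 \slashed{a}_1 \cdots \slashed{a}_n \gamma_5)$$
$$= (-1)^n\, \mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n \gamma_5 \gamma_5) \quad \text{(cyclic property of trace)}$$
$$= -\mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n) \quad \text{for } n \text{ odd}. \tag{J.29}$$
Thus, for n odd, T must vanish.

$$\mathrm{Tr}(\slashed{a}\slashed{b}) = 4a\cdot b \quad \text{(theorem 4)}. \tag{J.30}$$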


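Proof

$$\mathrm{Tr}(\slashed{a}\slashed{b}) = \tfrac{1}{2}\,\mathrm{Tr}(\slashed{a}\slashed{b} + \slashed{b}\slashed{a}) = \tfrac{1}{2}\, a_\mu b_\nu\, \mathrm{Tr}(\mathbf{1}\cdot 2g^{\mu\nu}) = 4a\cdot b.$$

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a\cdot b)(c\cdot d) + (a\cdot d)(b\cdot c) - (a\cdot c)(b\cdot d)] \quad \text{(theorem 5)}. \tag{J.31}$$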
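Proof


$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,\mathrm{Tr}(\slashed{c}\slashed{d}) - \mathrm{Tr}(\slashed{b}\slashed{a}\slashed{c}\slashed{d}) \tag{J.32}$$

using the result of (J.6). We continue taking $\slashed{a}$ through the trace in this
manner and use (J.30) to obtain

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,4(c\cdot d) - 2(a\cdot c)\,\mathrm{Tr}(\slashed{b}\slashed{d}) + \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{a}\slashed{d})$$
$$= 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + 8(b\cdot c)(a\cdot d) - \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{d}\slashed{a}) \tag{J.33}$$
and, since we can bring $\slashed{a}$ to the front of the trace, we have proved the theorem.

$$\mathrm{Tr}[\gamma_5\slashed{a}] = 0 \quad \text{(theorem 6)}. \tag{J.34}$$
This is a special case of theorem 3 since $\gamma_5$ contains four γ matrices.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}] = 0 \quad \text{(theorem 7)}. \tag{J.35}$$

This is not so obvious; it may be proved by writing out all the possible products
of γ matrices that arise.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}] = 0 \quad \text{(theorem 8)}. \tag{J.36}$$
Again this is a special case of theorem 3.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}\slashed{d}] = 4i\,\epsilon_{\alpha\beta\gamma\delta}\, a^\alpha b^\beta c^\gamma d^\delta \quad \text{(theorem 9)}. \tag{J.37}$$

This theorem follows by looking at components: the ε tensor just gives the
correct sign of the permutation.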
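
The trace theorems lend themselves to the same kind of numerical spot-check. The sketch below assumes the Dirac representation, contravariant components for the 4-vectors, and $\epsilon_{0123} = +1$; the random vectors and helper names are illustrative.

```python
import itertools
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
g5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def slash(a):
    """a-slash = a^mu gamma_mu for contravariant components a^mu."""
    return sum(metric[m, m] * a[m] * gamma[m] for m in range(4))

def dot(a, b):
    return a @ metric @ b

def eps(*idx):
    """epsilon_{alpha beta gamma delta} with epsilon_{0123} = +1."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(idx[i] > idx[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

rng = np.random.default_rng(2)
a, b, c, d = rng.normal(size=(4, 4))
A, B, C, D = (slash(v) for v in (a, b, c, d))

thm1 = np.isclose(np.trace(np.eye(4)), 4)
thm2 = np.isclose(np.trace(g5), 0)
thm3 = np.isclose(np.trace(A @ B @ C), 0)
thm4 = np.isclose(np.trace(A @ B), 4 * dot(a, b))
thm5 = np.isclose(np.trace(A @ B @ C @ D),
                  4 * (dot(a, b) * dot(c, d) + dot(a, d) * dot(b, c)
                       - dot(a, c) * dot(b, d)))
thm6 = np.isclose(np.trace(g5 @ A), 0)
thm7 = np.isclose(np.trace(g5 @ A @ B), 0)
thm8 = np.isclose(np.trace(g5 @ A @ B @ C), 0)
eps_abcd = sum(eps(i, j, k, l) * a[i] * b[j] * c[k] * d[l]
               for i, j, k, l in itertools.product(range(4), repeat=4))
thm9 = np.isclose(np.trace(g5 @ A @ B @ C @ D), 4j * eps_abcd)

print(thm1, thm2, thm3, thm4, thm5, thm6, thm7, thm8, thm9)
```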

The ε tensor is the four-dimensional generalization of the three-dimensional
antisymmetric tensor $\epsilon_{ijk}$. In the three-dimensional case we have the
well-known results

$$(\mathbf{b} \times \mathbf{c})_i = \epsilon_{ijk}\, b_j c_k \tag{J.38}$$

and

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = \epsilon_{ijk}\, a_i b_j c_k \tag{J.39}$$

for the triple scalar product.
Example of a Cross Section Calculation


In this appendix we outline in more detail the calculation of the $e^-s^+$ elastic
scattering cross section in section 8.3.2. The standard factors for the
unpolarized cross section lead to the expression

$$d\bar\sigma = \frac{1}{4E\omega|\mathbf{v}|}\,\frac{1}{2}\sum_{s,s'} |\mathcal{M}_{e^-s^+}(s, s')|^2\, d\mathrm{Lips}(s; k', p') \tag{K.1}$$

$$= \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\,\frac{1}{2}\sum_{s,s'} |\mathcal{M}_{e^-s^+}(s, s')|^2\, d\mathrm{Lips}(s; k', p') \tag{K.2}$$

using the result of problem 6.9, and the definition of Lorentz-invariant phase
space:

$$d\mathrm{Lips}(s; k', p') = (2\pi)^4\delta^4(k' + p' - k - p)\, \frac{1}{(2\pi)^3}\frac{d^3\mathbf{p}'}{2E'}\, \frac{1}{(2\pi)^3}\frac{d^3\mathbf{k}'}{2\omega'}. \tag{K.3}$$


Instead of evaluating the matrix element and phase space integral in the CM 
frame, or writing the result in invariant form, we shall perform the calculation 
entirely in the ‘laboratory’ frame, defined as the frame in which the target (i.e. 
the s-particle) is at rest: 

$$p^\mu = (M, \mathbf{0}) \tag{K.4}$$


where M is the s-particle mass. Let us look in some detail at the ‘laboratory’ 
frame kinematics for elastic scattering (figure K.1). Conservation of energy 
and momentum in the form 


$$p'^2 = (p + q)^2 \tag{K.5}$$
allows us to eliminate $p'$ to obtain the elastic scattering condition

$$2p\cdot q + q^2 = 0 \tag{K.6}$$

or

$$2p\cdot q = Q^2 \tag{K.7}$$

if we introduce the positive quantity

$$Q^2 = -q^2 \tag{K.8}$$

for a scattering process.
FIGURE K.1
Laboratory frame kinematics.


In all the applications with which we are concerned it will be a good 
approximation to neglect electron mass effects for high-energy electrons. We 
therefore set 


$$k'^2 = k^2 \approx 0 \tag{K.9}$$
so that
$$s + t + u \approx 2M^2 \tag{K.10}$$
where
$$s = (k + p)^2 = (k' + p')^2 \tag{K.11}$$
$$t = (k - k')^2 = (p' - p)^2 = q^2 \tag{K.12}$$
$$u = (k - p')^2 = (k' - p)^2 \tag{K.13}$$

are the usual Mandelstam variables. For the electron 4-vectors

$$k^\mu = (\omega, \mathbf{k}) \tag{K.14}$$
$$k'^\mu = (\omega', \mathbf{k}') \tag{K.15}$$

we can neglect the difference between the magnitude of the 3-momentum and
the energy,

$$\omega \approx |\mathbf{k}| \equiv k \tag{K.16}$$
$$\omega' \approx |\mathbf{k}'| \equiv k' \tag{K.17}$$
and in this approximation
$$q^2 = -2kk'(1 - \cos\theta) \tag{K.18}$$
or
$$q^2 = -4kk'\sin^2(\theta/2). \tag{K.19}$$

The elastic scattering condition (K.7) gives the following relation between $k$, $k'$
and $\theta$:
$$(k/k') = 1 + (2k/M)\sin^2(\theta/2). \tag{K.20}$$

It is important to realize that this relation is only true for elastic scattering:
for inclusive inelastic electron scattering $k$, $k'$ and $\theta$ are independent variables.
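
The laboratory-frame kinematics can be checked with explicit 4-vectors. In the sketch below the beam energy, angle and target mass are arbitrary illustrative numbers, the electron is treated as massless as in (K.9), and the helper names `mdot` and `elastic_kprime` are assumptions of this example.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])

def mdot(a, b):
    """Minkowski scalar product a.b."""
    return a @ metric @ b

def elastic_kprime(k, theta, M):
    """Scattered energy k' from the elastic condition (K.20)."""
    return k / (1 + (2 * k / M) * np.sin(theta / 2) ** 2)

k, theta, M = 5.0, 0.6, 0.938            # illustrative values (energy units arbitrary)
kp = elastic_kprime(k, theta, M)

k4 = np.array([k, 0.0, 0.0, k])                                  # incident electron
kp4 = np.array([kp, kp * np.sin(theta), 0.0, kp * np.cos(theta)])  # scattered electron
p4 = np.array([M, 0.0, 0.0, 0.0])                                # target at rest (K.4)
pp4 = k4 + p4 - kp4                                              # recoil from conservation
q4 = k4 - kp4

q2 = mdot(q4, q4)
s = mdot(k4 + p4, k4 + p4)
t = q2
u = mdot(k4 - pp4, k4 - pp4)

print("p'^2 - M^2       :", mdot(pp4, pp4) - M**2)               # elastic: recoil on shell
print("2p.q + q^2       :", 2 * mdot(p4, q4) + q2)               # (K.6)
print("q^2 vs (K.19)    :", q2, -4 * k * kp * np.sin(theta / 2) ** 2)
print("s + t + u - 2M^2 :", s + t + u - 2 * M**2)                # (K.10)
```

Choosing a different `kp` that does not satisfy (K.20) makes the first two printed quantities non-zero, which is the inelastic case where $k$, $k'$ and $\theta$ are independent.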
The first element of the cross section, the flux factor, is easy to evaluate:
$$4[(k\cdot p)^2 - m^2M^2]^{1/2} \approx 4Mk \tag{K.21}$$


in the approximation of neglecting the electron mass m. We now consider the 
calculation of the spin-averaged matrix element and the phase space integral 
in turn. 


K.1 The spin-averaged squared matrix element 


The Feynman rules for $e^-s^+$ scattering enable us to write the spin sum in the
form

$$\frac{1}{2}\sum_{s,s'} |\mathcal{M}_{e^-s^+}(s, s')|^2 = \left(\frac{4\pi\alpha}{q^2}\right)^2 L_{\mu\nu} T^{\mu\nu} \tag{K.22}$$

where $L_{\mu\nu}$ is the lepton tensor, $T^{\mu\nu}$ the s-particle tensor and the one-photon
exchange approximation has been assumed. From problem 8.12 we find the
result

$$L_{\mu\nu} T^{\mu\nu} = 8[2(k\cdot p)(k'\cdot p) + (q^2/2)M^2]. \tag{K.23}$$

In the 'laboratory' frame, neglecting the electron mass, this becomes

$$L_{\mu\nu} T^{\mu\nu} = 16M^2 kk' \cos^2(\theta/2). \tag{K.24}$$
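
The reduction of (K.23) to (K.24) can be confirmed numerically. This sketch uses massless electron 4-vectors in the frame (K.4) with illustrative values of $k$, $\theta$ and $M$; the variable names are assumptions of this example.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])

def mdot(a, b):
    """Minkowski scalar product a.b."""
    return a @ metric @ b

k, theta, M = 5.0, 0.6, 0.938            # illustrative values
kp = k / (1 + (2 * k / M) * np.sin(theta / 2) ** 2)   # elastic k' from (K.20)

k4 = np.array([k, 0.0, 0.0, k])
kp4 = np.array([kp, kp * np.sin(theta), 0.0, kp * np.cos(theta)])
p4 = np.array([M, 0.0, 0.0, 0.0])
q2 = mdot(k4 - kp4, k4 - kp4)

# Covariant form (K.23) versus laboratory form (K.24).
covariant = 8 * (2 * mdot(k4, p4) * mdot(kp4, p4) + 0.5 * q2 * M**2)
laboratory = 16 * M**2 * k * kp * np.cos(theta / 2) ** 2

print(covariant, laboratory)
```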




K.2 Evaluation of two-body Lorentz-invariant phase 
space in ‘laboratory’ variables 


We must evaluate

$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\,\delta^4(k' + p' - k - p)\, \frac{d^3\mathbf{p}'}{E'}\,\frac{d^3\mathbf{k}'}{\omega'} \tag{K.25}$$

in terms of 'laboratory' variables. This is in fact rather tricky and requires
some care. There are several ways it can be done:


(i) Use CM variables, put the cross section into invariant form, and then
translate to the 'laboratory' frame. This involves relating $dq^2$ to
$d(\cos\theta)$ which we shall do as an exercise at the end of this appendix.


(ii) Alternatively, we can work directly in terms of ‘laboratory’ variables 
and write 


$$d^3\mathbf{p}'/2E' = d^4p'\, \delta(p'^2 - M^2)\,\theta(p'^0). \tag{K.26}$$
The four-dimensional δ-function then removes the integration over $d^4p'$
leaving us only with an integration over the single δ-function
$\delta(p'^2 - M^2)$, in which $p'$ is understood to be replaced by $k + p - k'$. For details
of this last integration, see Bjorken and Drell (1964, p 114).


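(iii) We shall evaluate the phase space integral in a more direct manner. We
begin by performing the integral over $d^3\mathbf{p}'$ using the three-dimensional
δ-function from $\delta^4(k' + p' - k - p)$. In the 'laboratory' frame $\mathbf{p} = \mathbf{0}$,
so we have

$$\int d^3\mathbf{p}'\, \delta^3(\mathbf{k}' + \mathbf{p}' - \mathbf{k})\, f(\mathbf{p}', \mathbf{k}', \mathbf{k}) = f(\mathbf{p}', \mathbf{k}', \mathbf{k})\big|_{\mathbf{p}' = \mathbf{k} - \mathbf{k}'} \tag{K.27}$$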


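In the particular function $f(\mathbf{p}', \mathbf{k}', \mathbf{k})$ that we require, $\mathbf{p}'$ only appears via $E'$,
since
$$E'^2 = \mathbf{p}'^2 + M^2 \tag{K.28}$$

and
$$\mathbf{p}'^2 = k^2 + k'^2 - 2kk'\cos\theta \tag{K.29}$$

(setting the electron mass m to zero). We now change $d^3\mathbf{k}'$ to angular
variables:

$$d^3\mathbf{k}'/\omega' \approx k'\, dk'\, d\Omega \tag{K.30}$$
leading to
$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\, d\Omega\, dk'\, \frac{k'}{E'}\, \delta(E' + k' - k - M). \tag{K.31}$$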


Since $E'$ is a function of $k'$ and $\theta$ for a given $k$ (cf (K.28) and (K.29)), the
δ-function relates $k'$ and $\theta$ as required for elastic scattering (cf (K.20)), but until
the δ-function integration is performed they must be regarded as independent
variables. We have the integral

$$\frac{1}{(4\pi)^2}\int d\Omega\, dk'\, \frac{k'}{E'}\, \delta(f(k', \cos\theta)) \tag{K.32}$$

where

$$f(k', \cos\theta) = [(k^2 + k'^2 - 2kk'\cos\theta) + M^2]^{1/2} + k' - k - M \tag{K.33}$$

remaining to be evaluated. In order to obtain a differential cross section,
we wish to integrate over $k'$; for this $k'$ integration we must regard $\cos\theta$ in
$f(k', \cos\theta)$ as a constant, and use the result (E.36):

$$\delta(f(x)) = \frac{1}{|f'(x)|_{x = x_0}}\, \delta(x - x_0) \tag{K.34}$$
where $f(x_0) = 0$. The required derivative is
$$\left.\frac{df}{dk'}\right|_{\text{constant } \cos\theta} = \frac{1}{E'}\,(E' + k' - k\cos\theta) \tag{K.35}$$
and the δ-function requires that $k'$ is determined from $k$ and $\theta$ by the elastic
scattering condition

$$k'(\cos\theta) = \frac{k}{1 + (2k/M)\sin^2(\theta/2)}. \tag{K.36}$$

The integral (K.32) becomes

$$\frac{1}{(4\pi)^2}\int d\Omega\, dk'\, \frac{k'}{E'}\, \frac{1}{|df/dk'|_{k' = k'(\cos\theta)}}\, \delta[k' - k'(\cos\theta)] \tag{K.37}$$
and, after some juggling, $df/dk'$ evaluated at $k' = k'(\cos\theta)$ may be written
as
$$\left.\frac{df}{dk'}\right|_{k' = k'(\cos\theta)} = \frac{Mk}{E'k'}. \tag{K.38}$$
Thus we obtain finally the result
$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\,\frac{k'^2}{kM}\, d\Omega \tag{K.39}$$

for two-body elastic scattering in terms of 'laboratory' variables, neglecting
lepton masses.
Putting all these elements together yields the advertised result

$$\frac{d\bar\sigma}{d\Omega} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\cos^2(\theta/2)\,\frac{k'}{k}. \tag{K.40}$$
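
The pieces of the calculation can be assembled numerically as a consistency check. The sketch below uses illustrative values of $k$, $\theta$ and $M$, an approximate value for $\alpha$, and massless electrons; it compares the derivative (K.35) with a finite difference and with (K.38), then combines the flux factor (K.21), the matrix element (K.22) with (K.24), and the phase space (K.39).

```python
import numpy as np

alpha = 1 / 137.036                      # approximate fine structure constant
k, theta, M = 5.0, 0.6, 0.938            # illustrative values
cos_t = np.cos(theta)

def f(kp):
    """Argument of the delta-function, eq. (K.33), at fixed cos(theta)."""
    return np.sqrt(k**2 + kp**2 - 2 * k * kp * cos_t + M**2) + kp - k - M

kp = k / (1 + (2 * k / M) * np.sin(theta / 2) ** 2)   # root of f, eq. (K.36)
Ep = k + M - kp                                        # recoil energy at the root

h = 1e-6
dfdk_numeric = (f(kp + h) - f(kp - h)) / (2 * h)
dfdk_K35 = (Ep + kp - k * cos_t) / Ep
dfdk_K38 = M * k / (Ep * kp)

flux = 4 * M * k                                                   # (K.21)
q2 = -4 * k * kp * np.sin(theta / 2) ** 2                          # (K.19)
spin_sum = (4 * np.pi * alpha / q2) ** 2 * 16 * M**2 * k * kp * np.cos(theta / 2) ** 2
lips_per_solid_angle = kp**2 / ((4 * np.pi) ** 2 * k * M)          # (K.39)

dsigma_assembled = spin_sum * lips_per_solid_angle / flux
dsigma_K40 = (alpha**2 * np.cos(theta / 2) ** 2
              / (4 * k**2 * np.sin(theta / 2) ** 4)) * kp / k

print(dsigma_assembled, dsigma_K40)
```

The result is in natural units (inverse energy squared); converting to a physical area would require a factor of $(\hbar c)^2$, which is not included here.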


As a final twist to this calculation let us consider the change of variables from
$d\Omega$ to $dq^2$ in this elastic scattering example. In the unpolarized case

$$d\Omega = 2\pi\, d(\cos\theta) \tag{K.41}$$
and
$$q^2 = -2kk'(1 - \cos\theta) \tag{K.42}$$
where
$$k' = \frac{k}{1 + (2k/M)\sin^2(\theta/2)}. \tag{K.43}$$
Thus, since $k'$ and $\cos\theta$ are not independent variables, we have
$$dq^2 = 2kk'\, d(\cos\theta) + (1 - \cos\theta)(-2k)\,\frac{dk'}{d(\cos\theta)}\, d(\cos\theta). \tag{K.44}$$
From (K.20) we find
$$\frac{dk'}{d(\cos\theta)} = \frac{k'^2}{M} \tag{K.45}$$
and, after some routine juggling, arrive at the result
$$dq^2 = 2k'^2\, d(\cos\theta). \tag{K.46}$$
If we introduce the variable $\nu$ defined, for elastic scattering, by
$$2p\cdot q \equiv 2M\nu = -q^2 \tag{K.47}$$
we have immediately
$$d\nu = -\frac{k'^2}{M}\, d(\cos\theta). \tag{K.48}$$
Similarly, if we introduce the variable $y$ defined by
$$y = \nu/k \tag{K.49}$$
we find
$$dy = -\frac{k'^2}{2\pi kM}\, d\Omega \tag{K.50}$$
for elastic scattering.
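
The derivatives in (K.45), (K.46) and (K.48) can be checked by finite differences. The sketch below treats $k'$, $q^2$ and $\nu$ as functions of $\cos\theta$ along the elastic curve; the numerical values of $k$, $M$ and the evaluation point `c0` are illustrative.

```python
import numpy as np

k, M = 5.0, 0.938                        # illustrative values

def kprime(c):
    """Elastic k' as a function of c = cos(theta); (K.43) with 2 sin^2(theta/2) = 1 - c."""
    return k / (1 + (k / M) * (1 - c))

def q2(c):
    """q^2 along the elastic curve, eq. (K.42)."""
    return -2 * k * kprime(c) * (1 - c)

def nu(c):
    """nu = -q^2 / 2M, eq. (K.47)."""
    return -q2(c) / (2 * M)

def derivative(func, c, h=1e-6):
    """Central finite difference in cos(theta)."""
    return (func(c + h) - func(c - h)) / (2 * h)

c0 = 0.3                                 # illustrative value of cos(theta)
kp0 = kprime(c0)

dk_dc = derivative(kprime, c0)           # compare with k'^2 / M      (K.45)
dq2_dc = derivative(q2, c0)              # compare with 2 k'^2        (K.46)
dnu_dc = derivative(nu, c0)              # compare with -k'^2 / M     (K.48)

print(dk_dc, kp0**2 / M)
print(dq2_dc, 2 * kp0**2)
print(dnu_dc, -kp0**2 / M)
```

In the laboratory frame $\nu = k - k'$, so the negative sign of $d\nu/d(\cos\theta)$ simply reflects the fact that $k'$ grows towards forward angles.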


L 


Feynman Rules for Tree Graphs in QED 


2 → 2 cross section formula

$$d\sigma = \frac{1}{4[(p_1\cdot p_2)^2 - m_1^2 m_2^2]^{1/2}}\, |\mathcal{M}|^2\, d\mathrm{Lips}(s; p_3, p_4).$$

1 → 2 decay formula

$$d\Gamma = \frac{1}{2m_1}\, |\mathcal{M}|^2\, d\mathrm{Lips}(m_1^2; p_2, p_3).$$

Note that for two identical particles in the final state an extra factor of $\tfrac{1}{2}$
must be included in these formulae.

The amplitude iM is the invariant matrix element for the process under 
consideration, and is given by the Feynman rules of the relevant theory. For 
particles with non-zero spin, unpolarized cross sections are formed by averag- 
ing over initial spin components and summing over final. 




L.1 External particles 


Spin-½
For each fermion or antifermion line entering the graph, include the spinor
$$u(p, s) \quad \text{or} \quad v(p, s) \tag{L.1}$$
and for spin-½ particles leaving the graph the spinor
$$\bar u(p', s') \quad \text{or} \quad \bar v(p', s'). \tag{L.2}$$


Photons


For each photon line entering the graph include a polarization vector

$$\epsilon_\mu(k, \lambda) \tag{L.3}$$
and for photons leaving the graph the vector
$$\epsilon^*_\mu(k', \lambda'). \tag{L.4}$$


L.2 Propagators 


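Spin-0
$$\frac{i}{p^2 - m^2 + i\epsilon} \tag{L.5}$$
Spin-½
$$\frac{i}{\slashed{p} - m} = \frac{i(\slashed{p} + m)}{p^2 - m^2 + i\epsilon} \tag{L.6}$$
Photon
$$\frac{i}{k^2 + i\epsilon}\left(-g^{\mu\nu} + (1 - \xi)\,\frac{k^\mu k^\nu}{k^2}\right) \tag{L.7}$$

for a general $\xi$. Calculations are usually performed in the Lorentz or Feynman
gauge with $\xi = 1$ and photon propagator equal to

$$\frac{i(-g^{\mu\nu})}{k^2 + i\epsilon}. \tag{L.8}$$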
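
The two forms of the spin-½ propagator in (L.6) are related by a one-line manipulation, sketched here using the identity $\slashed{p}\slashed{p} = p^2$ (which follows from (J.6) with $a = b = p$) and suppressing the $i\epsilon$ prescription:

```latex
\begin{align*}
(\slashed{p} - m)(\slashed{p} + m) &= \slashed{p}\slashed{p} - m^2 = p^2 - m^2, \\
\frac{i}{\slashed{p} - m}
  &= \frac{i(\slashed{p} + m)}{(\slashed{p} - m)(\slashed{p} + m)}
   = \frac{i(\slashed{p} + m)}{p^2 - m^2}.
\end{align*}
```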




L.3 Vertices 


Spin-0

$$-ie(p + p')_\mu \quad \text{(for charge } +e\text{)}$$

Spin-½

$$-ie\gamma_\mu \quad \text{(for charge } +e\text{)}$$
Plate I

Distributions of x times the unpolarized parton distribution functions f(x)
(where $f = u_v, d_v, \bar u, \bar d, s, c, b, g$) and their associated uncertainties using the
MSTW2008 parametrization (Martin et al. 2009) at a scale $\mu^2 = 10$ GeV$^2$
and $\mu^2 = 10{,}000$ GeV$^2$. [Figure reproduced courtesy Michael Barnett, for the
Particle Data Group, from the review of Structure Functions by B F Foster, 
A D Martin and M G Vincter, section 16 in the Review of Particle Physics, 
K Nakamura et al.(Particle Data Group) Journal of Physics G 37 (2010) 
075021, IOP Publishing Limited.] (See figure 9.9 on page 283.) 


Plate II

The cross section $\sigma$ for the annihilation process $e^+e^- \to$ hadrons, and the
ratio R (see equation (9.100)), as a function of cm energy. [Figure reproduced
courtesy Michael Barnett, for the Particle Data Group, from the Review of 
Particle Physics, K Nakamura et al. (Particle Data Group) Journal of Physics 
G 37 (2010) 075021 IOP Publishing Limited.] (See figure 9.16 on page 290.) 